What is Combinatorics?


Combinatorics is a young field of mathematics, starting to be an independent
branch only in the 20th century. However, combinatorial methods and problems
have been around for much longer. Many combinatorial problems look entertaining
or aesthetically pleasing, and indeed one can say that the roots of combinatorics lie
in mathematical recreations and games. Nonetheless, this field has grown to be
of great importance in today's world, and not only because of its use in other fields
such as the physical sciences, social sciences, biological sciences, information theory and
computer science.


Combinatorics is concerned with: 


• Arrangements of elements in a set into patterns satisfying specific rules,
generally referred to as discrete structures. Here “discrete” (as opposed
to continuous) typically also means finite, although we will consider some
infinite structures as well.


• The existence, enumeration, analysis and optimization of discrete structures.


• Interconnections, generalization relations and specialization relations between
several discrete structures.


Existence: We want to arrange elements in a set into patterns satisfying 
certain rules. Is this possible? Under which conditions is it possible? 
What are necessary, what sufficient conditions? How do we find such 
an arrangement? 


Enumeration: Assume certain arrangements are possible. How many 
such arrangements exist? Can we say “there are at least this many”, 
“at most this many” or “exactly this many”? How do we generate all 
arrangements efficiently? 


Classification: Assume there are many arrangements. Do some of these 
arrangements differ from others in a particular way? Is there a natural 
partition of all arrangements into specific classes? 


Meta-Structure: Do the arrangements even carry a natural underlying 
structure, e.g., some ordering? When are two arrangements closer to 
each other or more similar than some other pair of arrangements? Are 
different classes of arrangements in a particular relation? 


Optimization: Assume some arrangements differ from others according 
to some measurement. Can we find or characterize the arrangements 
with maximum or minimum measure, i.e. the “best” or “worst” ar- 
rangements? 


Interconnections: Assume a discrete structure has some properties (num- 
ber of arrangements, ...) that match with another discrete structure. 
Can we specify a concrete connection between these structures? If 
this other structure is well-known, can we draw conclusions about our 
structure at hand? 


We will give some life to this abstract list of tasks in the context of the 
following example. 


Example (Dimer Problem). Consider a generalized chessboard of size m x n (m
rows and n columns). We want to cover it perfectly with dominoes of size 2 x 1
or with generalized dominoes, called polyominoes, of size k x 1. That means
we want to put dominoes (or polyominoes) horizontally or vertically onto the
board such that every square of the board is covered and no two dominoes (or
polyominoes) overlap. A perfect covering is also called a tiling. Consider Figure
1 for an example.


Figure 1: The 6 x 8 board can be tiled with 24 dominoes. The 5 x 5 board 
cannot be tiled with dominoes. 


Existence 


If you look at Figure 1, you may notice that whenever m and n are both odd 
(in the Figure they were both 5), then the board has an odd number of squares 
and a tiling with dominoes is not possible. If, on the other hand, m is even or 
n is even, a tiling can easily be found. We will generalize this observation for 
polyominoes: 


Claim. An m x n board can be tiled with polyominoes of size 1 x k if and only 
if k divides m or n. 


Proof. “⇐” If k divides m, it is easy to construct a tiling: Just cover every
column with m/k vertical polyominoes. Similarly, if k divides n, cover
every row using n/k horizontal polyominoes.


“⇒” Assume k divides neither m nor n (but note that k could still divide
the product $m \cdot n$). We need to show that no tiling is possible. We
write $m = s_1 k + r_1$, $n = s_2 k + r_2$ for appropriate $s_1, s_2, r_1, r_2 \in \mathbb{N}$ and
$0 < r_1, r_2 < k$. Without loss of generality, assume $r_1 \le r_2$ (the argument
is similar if $r_2 < r_1$). Consider the colouring of the m x n board with k
colours as shown in Figure 2.


Figure 2: Our polyominoes have size k x 1. We use k colours (1 = white, k
= black) to colour the m x n board (here: k = 6, m = 8, n = 9). Cutting
the board at coordinates that are multiples of k divides the board into several
chunks. All chunks have the same number of squares of each colour, except for
the bottom right chunk where there are more squares of colour 1 (here: white)
than of colour 2 (here: light gray).


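Formally, the colour of the square (i, j) in row i and column j is defined to be ((i − j) mod k) + 1.
Any polyomino of size k x 1 that is placed on the board will cover exactly
one square of each colour. However, there are more squares of colour 1
than of colour 2, which shows that no tiling with k x 1 polyominoes is possible.


Indeed, the bottom right chunk has $r_1$ rows and $r_2 \ge r_1$ columns, so it contains
$r_1$ squares of colour 1 but only $r_1 - 1$ squares of colour 2. For the number of
squares coloured with 1 and 2 we therefore have:


$$\#\,\text{squares coloured with 1} = k s_1 s_2 + s_1 r_2 + s_2 r_1 + r_1$$


$$\#\,\text{squares coloured with 2} = k s_1 s_2 + s_1 r_2 + s_2 r_1 + r_1 - 1 \qquad \square$$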
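The colouring argument can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `colour_counts` is hypothetical, and it assumes that i is the row index and j the column index, both counted from 1. For the board of Figure 2 it should report that colour 1 occurs exactly once more than colour 2.

```python
from collections import Counter


def colour_counts(m, n, k):
    """Count the squares of each colour on the m x n board.

    Square (i, j), with row i and column j counted from 1,
    receives colour ((i - j) mod k) + 1.
    """
    counts = Counter()
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            counts[(i - j) % k + 1] += 1
    return counts


# Board of Figure 2: k = 6, m = 8, n = 9 (k divides neither m nor n).
counts = colour_counts(8, 9, 6)
print(counts[1], counts[2])  # colour 1 occurs once more than colour 2

# If k divides m, every colour occurs equally often.
print(set(colour_counts(6, 9, 6).values()))
```

Since every k x 1 polyomino covers each colour exactly once, unequal colour counts rule out a tiling.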


Now that the existence of tilings is answered for rectangular boards, we may 
be inclined to consider other types of boards as well: 


Claim (Mutilated Chessboard). The n x n board with the bottom-left and top-right
squares removed (see Figure 3) cannot be tiled with (regular) dominoes.


Figure 3: A “mutilated” 6 x 6 board. The missing corners have the same colour. 


Proof. If n is odd, then the total number of squares is odd and clearly no tiling 
can exist. If n is even, consider the usual chessboard-colouring: In it, the missing 
squares are of the same colour, say black. Since there was an equal number of 
black and white squares in the non-mutilated board, there are now two more 
white squares than black squares. Since dominoes always cover exactly one 
black and one white square, no tiling can exist. 


Other ways of pruning the board have been studied, but we will not consider 
them here. 


Enumeration 


A general formula to determine the number of ways an m x n board can be tiled
with dominoes is known. For a 2m x 2n board the following formula is due to
Temperley and Fisher [TF61] and independently Kasteleyn [Kas61]:


$$4^{mn} \prod_{i=1}^{m} \prod_{j=1}^{n} \left( \cos^2 \frac{i\pi}{2m+1} + \cos^2 \frac{j\pi}{2n+1} \right)$$


The special case of an 8 x 8 board is already non-trivial:


Theorem (Fisher 1961). There are $2^4 \cdot 17^2 \cdot 53^2 = 12{,}988{,}816$ ways to tile the
8 x 8 board with dominoes.
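The count for the 8 x 8 board can be reproduced numerically. This is a minimal sketch, assuming the product formula in its usual form $4^{mn} \prod_{i=1}^{m} \prod_{j=1}^{n} (\cos^2 \frac{i\pi}{2m+1} + \cos^2 \frac{j\pi}{2n+1})$ for the 2m x 2n board; the function name `domino_tilings` is hypothetical, and the result is rounded because the product is evaluated in floating point.

```python
import math


def domino_tilings(m, n):
    """Evaluate the product formula for the 2m x 2n board (floating point)."""
    product = 1.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            product *= (math.cos(i * math.pi / (2 * m + 1)) ** 2
                        + math.cos(j * math.pi / (2 * n + 1)) ** 2)
    # The exact value is an integer; rounding removes the numerical error.
    return round(4 ** (m * n) * product)


print(domino_tilings(1, 1))  # 2 x 2 board
print(domino_tilings(4, 4))  # 8 x 8 board
```

For large boards the floating-point error grows, so this sketch is only meant for small m and n.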


Classification 


Consider tilings of the 4 x 4 board with dominoes. For some of these tilings 
there is a vertical line through the board that does not cut through any domino. 
Call such a line a vertical cut. In the same way we define horizontal cuts. 

As it turns out, for every tiling of the 4 x 4 board at least one cut exists, 
possibly several (try this for yourself!). 

Hence the set $\mathcal{T}$ of all tilings can be partitioned into


$\mathcal{T}_1 = \{T \mid T \text{ allows a horizontal cut but no vertical cut}\}$,
$\mathcal{T}_2 = \{T \mid T \text{ allows a vertical cut but no horizontal cut}\}$,
$\mathcal{T}_3 = \{T \mid T \text{ allows both a horizontal and a vertical cut}\}$.


Figure 4 shows one tiling for each of these three classes.

Figure 4: Some tilings have horizontal cuts, some have vertical cuts and some
have both.


As another example consider the board B consisting of two 4 x 4 boards
joined by two extra squares as shown in Figure 5.

In a tiling of B it is impossible that one extra square is covered by a domino
containing a square from the left side, while the other extra square is covered by
a domino containing a square from the right side: If we place the two dominoes
as shown on the right in Figure 5, we are left with two disconnected parts of
size 15. And since 15 is odd, this cannot work out. Hence the set $\mathcal{T}$ of all tilings
of B can be partitioned into


$\mathcal{T}_1 = \{T \mid T \text{ matches both extra squares to the left}\}$,
$\mathcal{T}_2 = \{T \mid T \text{ matches both extra squares to the right}\}$.


Figure 5: The board B consisting of two 4 x 4 boards and two extra squares
connecting them as shown. The partial covering on the right cannot be extended
to a tiling.


Meta Structure 


We say two tilings are adjacent if one can be transformed into the other by
taking two horizontal dominoes lying directly above one another (together
covering a 2 x 2 square) and turning them by 90 degrees so that they become two
vertical dominoes lying side by side (or vice versa). Call this a turn operation. If we draw all
tilings of the 4 x 4 board and then draw a line between tilings that are adjacent,
we get the picture on the left of Figure 6.

With this in mind, we can speak about the distance of two tilings, the number
of turn operations required to transform one into the other. We could also look
for a pair of tilings with maximum distance (this would be an optimization
problem).

We can even discover a deeper structure, but for this we need to identify
different types of turn operations. We can turn a horizontal pair into a vertical
pair, call this an up-turn, or a vertical pair into a horizontal pair, call this a
side-turn. The turn can happen on one of two backgrounds: in the chessboard
colouring of the 2 x 2 square, the white squares form either a falling or a rising
diagonal. We call an operation a flip if it is an up-turn on one of these
backgrounds or a side-turn on the other. Call it a flop
otherwise.

Quite surprisingly, walking upward in the graph in Figure 6 always corre- 
sponds to flips and walking downwards corresponds to flops. This means that 
there is a natural partial ordering on the tilings: We can say a tiling B is greater 
than a tiling A if we can get from A to B by a sequence of flips. As it turns out, 
the order this gives has special properties. It is a so-called distributive lattice, 
in particular, there is a greatest tiling. 


Optimization 


Different tilings have a different set of decreasing free paths. Such a path (see
Figure 7) proceeds monotonically from the top left corner of the board along
the borders of squares to the bottom right, and does not “cut through” any
domino.

Some tilings admit more such paths than others. It is conjectured that in an
m x n board the maximum number is always attained by one of the two tilings
where all dominoes are oriented the same way (see right of Figure 7). But no
proof has been found yet.


Figure 6: On the left: The set of all tilings of the 4 x 4 board. Two tilings 
are connected by an edge if they can be transformed into one another by a 
single flip. On the right: The same picture, but with the chessboard still drawn 
beneath the tilings, so you can check that upward edges correspond to flips (as 
defined in the text). 


1 Permutations and Combinations 


Some people mockingly say that combinatorics is merely about counting things. 
In the section at hand we will do little to dispel this prejudice. 

We assume the reader is familiar with basic set theory and notions such as 
unions, intersections, Cartesian products and differences of two finite sets. 


1.1 Basic Counting Principles 


We list a few very simple and intuitive observations. Many of them were
already used in the introduction. You may think that they are so easy that
they do not even deserve a name. Still: If we take counting seriously, we should
be aware of simple facts we implicitly use all the time.


Figure 7: The green path does not cut through dominoes. We believe that the 
boring tiling on the right admits the maximum number of such paths, but this 
has not been proved for large boards. 


1.1.1 Addition Principle 


We say a finite set S is partitioned into parts $S_1, \ldots, S_k$ if the parts are disjoint
and their union is S. In other words $S_i \cap S_j = \emptyset$ for $i \ne j$ and $S_1 \cup S_2 \cup \cdots \cup S_k = S$.
The addition principle claims that in this case

$$|S| = |S_1| + |S_2| + \cdots + |S_k|.$$


Example. Let S be the set of students attending the combinatorics lecture. It
can be partitioned into parts $S_1$ and $S_2$ where

$S_1$ = set of students that like easy examples.

$S_2$ = set of students that don’t like easy examples.


If $|S_1| = 22$ and $|S_2| = 8$ then we can conclude $|S| = 30$.


1.1.2 Multiplication Principle 
If S is a finite set that is the product of $S_1, \ldots, S_k$, i.e. $S = S_1 \times S_2 \times \cdots \times S_k$,
then the multiplication principle claims that

$$|S| = |S_1| \times |S_2| \times \cdots \times |S_k|.$$

Example. Consider the license plates in Karlsruhe. They have the form

KA - $l_1 l_2$ $n_1 n_2 n_3$

where $l_1, l_2$ are letters from {A,...,Z} and $n_1, n_2, n_3$ are digits from {0,...,9}.
However, $l_2$ and $n_3$ may be omitted.
We count the number of possibilities for each position:

Position:        K   A   -   l_1  l_2  n_1  n_2  n_3
Possibilities:   1   1   1   26   27   10   10   11

Here we modeled the case that $l_2$ is omitted by allowing an additional value ⊥
for $l_2$ that stands for “omitted”. The same holds for $n_3$.

Formally we could say that the set of all valid license plates is given as the
product:

{K} × {A} × {-} × {A,...,Z} × {A,...,Z,⊥} × {0,...,9} × {0,...,9} × {0,...,9,⊥}.

By the Multiplication Principle the size of this set is given as
1 · 1 · 1 · 26 · 27 · 10 · 10 · 11 = 772200.
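The license plate count can be confirmed by brute force. The following minimal sketch builds the eight factor sets, multiplies their sizes and, independently, enumerates the whole Cartesian product. The names `positions`, `by_formula`, `by_enumeration` and the marker `OMITTED` (standing for the extra "omitted" value) are assumptions made for this illustration.

```python
from itertools import product
from string import ascii_uppercase

OMITTED = None  # stands for the additional value "omitted"

letters = list(ascii_uppercase)   # A, ..., Z
digits = list(range(10))          # 0, ..., 9

positions = [
    ["K"], ["A"], ["-"],
    letters,                # l_1
    letters + [OMITTED],    # l_2 may be omitted
    digits,                 # n_1
    digits,                 # n_2
    digits + [OMITTED],     # n_3 may be omitted
]

# Multiplication principle: multiply the sizes of the factors.
by_formula = 1
for values in positions:
    by_formula *= len(values)

# Independent check: enumerate every element of the product.
by_enumeration = sum(1 for _ in product(*positions))

print(by_formula, by_enumeration)
```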
1.1.3 Subtraction Principle


Let S be a subset of a finite set T. We define $\bar{S} := T \setminus S$, the complement of S
in T. Then the subtraction principle claims that


$$|\bar{S}| = |T| - |S|.$$


Example. Let T be the set of students studying at KIT and S the set of students
studying neither math nor computer science. If we know $|T| = 23905$ and
$|S| = 20178$, then we can compute the number $|\bar{S}|$ of students studying either
math or computer science:


$$|\bar{S}| = |T| - |S| = 23905 - 20178 = 3727.$$


1.1.4 Bijection Principle 


If S and T are finite sets, then the bijection principle claims that
$|S| = |T|$ if and only if there exists a bijection between S and T.


Example. Let S be the set of students attending the lecture and T the set of
homework submissions for the first problem sheet.

If the number of students and the number of submissions coincide, then there
is a bijection between students and submissions¹ and vice versa.


We now consider two other principles that are similarly intuitive and natural 
but will frequently and explicitly occur as patterns in proofs. 
1.1.5 Pigeonhole Principle 


Let $S_1, \ldots, S_m$ be finite sets that are pairwise disjoint and $|S_1| + |S_2| + \ldots + |S_m| = n$.
Then the pigeonhole principle claims that


$$\exists i \in \{1,\ldots,m\}: |S_i| \ge \left\lceil \frac{n}{m} \right\rceil \quad\text{and}\quad \exists j \in \{1,\ldots,m\}: |S_j| \le \left\lfloor \frac{n}{m} \right\rfloor.$$

Example. Assume there are 5 holes in the wall where pigeons nest. Say there
is a set $S_i$ of pigeons nesting in hole i. Assume there are n = 17 pigeons in
total. Then we know:


• There is some hole with at least ⌈17/5⌉ = 4 pigeons.


• There is some hole with at most ⌊17/5⌋ = 3 pigeons.


1.1.6 Double counting 


If we count the same quantity in two different ways, then this gives us a (perhaps
non-trivial) identity. This method often helps to get control of overcounting. Let
A and B be finite sets and consider a set $S \subseteq A \times B$. Then the method of double
counting claims that


$$\sum_{a \in A} |\{b \in B \mid (a,b) \in S\}| = |S| = \sum_{b \in B} |\{a \in A \mid (a,b) \in S\}|.$$


¹ Whether a “natural” correspondence can easily be determined is an entirely different
matter: That would require that you cleanly write your name onto your submission.

Example (Handshaking Lemma). Assume there are n people at a party and
everybody will shake hands with everybody else. What is the total number N of 
handshakes that occur? If we just sum up for each guest each of its handshakes, 
then we overcount each handshake. In order to control this overcount we shall 
instead count the number M of pairs (P,S) where P is a guest and S is a 
handshake involving P. We count this number in two ways: 


First way: There are n guests and everybody shakes n − 1 hands. So


$$M = \sum_{P \text{ guest}} |\{(P,S) \mid S \text{ handshake with } P\}| = \sum_{P \text{ guest}} (n-1) = n \cdot (n-1).$$


Second way: Every handshake involves two guests. So


$$M = \sum_{S \text{ handshake}} |\{(P,S) \mid P \text{ involved in } S\}| = \sum_{S \text{ handshake}} 2 = 2 \cdot N.$$


Together, we see that $n(n-1) = M = 2N$ and hence we obtain the identity


$$N = \frac{n \cdot (n-1)}{2}.$$


Example. In the example above we have seen a first way to count the
number N (already utilizing double counting). Here we count N in a second
way. Label the guests from 1 to n. To avoid counting a handshake twice, we
count for guest i only the handshakes with guests of smaller numbers. Then the
total number of handshakes is


$$\sum_{i=1}^{n} (i-1) = \sum_{i=0}^{n-1} i = \sum_{i=1}^{n-1} i.$$

Hence we obtain another identity

$$\sum_{i=1}^{n-1} i = N = \frac{n \cdot (n-1)}{2}.$$
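Both handshake counts can be replayed on a small party. The sketch below is only an illustration (the function name `handshake_counts` is hypothetical): it lists all handshakes explicitly, counts the pairs (P, S) once per guest and once per handshake, and also sums the handshakes each guest has with guests of smaller number.

```python
from itertools import combinations


def handshake_counts(n):
    """Return (N, M counted by guests, M counted by handshakes, N via smaller guests)."""
    guests = range(1, n + 1)
    handshakes = list(combinations(guests, 2))  # every pair shakes hands once
    # First way: for each guest, count the handshakes involving that guest.
    m_by_guest = sum(sum(1 for s in handshakes if p in s) for p in guests)
    # Second way: for each handshake, count the guests involved in it.
    m_by_handshake = sum(len(s) for s in handshakes)
    # Guest i only counts the handshakes with guests of smaller number.
    n_by_smaller = sum(i - 1 for i in guests)
    return len(handshakes), m_by_guest, m_by_handshake, n_by_smaller


print(handshake_counts(7))
```

For n = 7 all four numbers are consistent with N = n(n − 1)/2 and M = 2N.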


1.2 Ordered Arrangements — Strings, Maps and Products 


Define [n] := {1,...,n} to be the set of the first n natural numbers and let X be
a finite set.

An ordered arrangement of n elements of X is a map s: [n] → X. Ordered
arrangements are so common that they occur in many situations and are known
under different names. We might be inclined to say that “Banana” is a string (or
word) and that “0815422372” is a sequence even though they are both ordered
arrangements of characters. Depending on the circumstances different notation
may be in order as well.

We will introduce the most common perspectives on what an ordered
arrangement is and the accompanying names and notation. Throughout, we use
the example X = {□, ○, |, △, ▽} and the ordered arrangement s of n = 7 elements
given as the table:

  i    | 1 2 3 4 5 6 7
  s(i) | □ ○ ○ △ ○ ▽ ▽


function: We call [n] the domain of s and s(i) the image of i under s (i ∈ [n]).
The set {x ∈ X | s(i) = x for some i ∈ [n]} is the range of s.


In the example, the domain of s is {1,2,3,4,5,6,7}, the image of 3 is ○
and the range of s is {□, ○, △, ▽}. Note that | is not in the range of s.


string: We call s an X-string of length n and write s = s(1)s(2)s(3)···s(n).
The i-th position (or character) in s is denoted by $s_i = s(i)$. The set X
is an alphabet and its elements are letters. Often s is called a word.


In the example, we would say that s = □○○△○▽▽ is a string (or word)
of length n = 7 over the five-letter alphabet X. The fourth character of s is
$s_4$ = △.


tuple: We can view s as an element of the n-fold Cartesian product
$X_1 \times \ldots \times X_n$, where $X_i = X$ for i ∈ [n]. We call s a tuple and write it
as $(s_1, s_2, \ldots, s_n)$. The element $s_i$ is called the i-th coordinate (i ∈ [n]).
Viewing arrangements as elements of products makes it easy to restrict the
number of allowed values for a particular coordinate (just choose $X_i \subsetneq X$).


In the example we would write s = (□, ○, ○, △, ○, ▽, ▽). Its first coordinate
is □.


sequence: If X is a set of numbers, then s is often called a sequence and
$s_i$ is the i-th position, term or element in the sequence. We would not
use this terminology in the example above but could have, for instance,
s(i) = 3i + 2/5. In that case the fourth term is equal to 12.4.


In the following we will mostly view arrangements as functions but will freely 
switch perspective when appropriate. 


1.2.1 Permutations 


The most important ordered arrangements are those in which the mapping is
injective, i.e., s(i) ≠ s(j) for i ≠ j.

Definition 1.1 (Permutation). Let X be a finite set. We define 

permutation: A permutation of X is a bijective map π: [n] → X, where n = |X|. Usually we
choose X = [n] and denote the set of all permutations of [n] by $S_n$.


We tend to write permutations as strings if n < 10, take for example
π = 2713546 by which we mean the function:


  i    | 1 2 3 4 5 6 7
  π(i) | 2 7 1 3 5 4 6

k-Permutation: For 1 ≤ k ≤ |X| a k-permutation of X is an ordered arrangement
of k distinct elements of X, i.e. an injective map π: [k] → X. The
set of all k-permutations of X = [n] is denoted by P(n,k). In particular
we have $S_n = P(n,n)$.

Circular k-Permutation: We say that two k-permutations $\pi_1, \pi_2 \in P(n,k)$
are circular equivalents if there exists a shift s ∈ [k] such that the following
implication holds for all i, j ∈ [k]:


$$i + s \equiv j \pmod{k} \;\Longrightarrow\; \pi_1(i) = \pi_2(j)$$


This equivalence relation partitions P(n,k) into equivalence classes. A
class is called a circular k-permutation. The set of all circular k-permutations
is denoted by $P_c(n,k)$.


Take for example $\pi_1 = 76123$, $\pi_2 = 12376$ and $\pi_3 = 32167$. Then $\pi_1$ and $\pi_2$
are circular equivalents as witnessed by the shift s = 3. They are therefore
representatives of the same circular 5-permutation (with elements from
[7]). $\pi_3$ belongs to a different circular 5-permutation. Consider Figure 8
for a visualization.


Figure 8: Visualization of the 5-permutations $\pi_1$, $\pi_2$ and $\pi_3$ where values are
arranged in a circular fashion (the first value being on the right of the cycle).
Note that $\pi_1$ and $\pi_2$ can be transformed into one another by “turning”. They
are therefore circular equivalents. Flipping is not allowed however, so $\pi_3$ is not
equivalent to $\pi_1$ and $\pi_2$.


We now count the number of (circular) k-permutations. For this we need
the notation of factorials: $n! := 1 \cdot 2 \cdot \ldots \cdot n$.


Theorem 1.2. For any natural numbers $1 \le k \le n$ we have


(i) $|P(n,k)| = n \cdot (n-1) \cdot \ldots \cdot (n-k+1) = \frac{n!}{(n-k)!}$


(ii) $|P_c(n,k)| = \frac{n!}{k \cdot (n-k)!}$

Proof. (i) A k-permutation is an injective map π: [k] → [n]. We count the
number of ways to pick such a map, picking the images one after the
other. There are n ways to choose π(1). Given a value for π(1), there are
n − 1 ways to choose π(2) (we may not choose π(1) again). Continuing
like this, there are n − i + 1 ways to pick π(i) and the last value we pick is
π(k) with n − k + 1 possibilities. Every k-permutation can be constructed
like this in exactly one way.


The total number of k-permutations is therefore given as the product:


$$|P(n,k)| = n \cdot (n-1) \cdot \ldots \cdot (n-k+1) = \frac{n!}{(n-k)!}.$$


(ii) We count P(n,k) in two ways:


First way: $|P(n,k)| = \frac{n!}{(n-k)!}$, which we proved in (i).


Second way: $|P(n,k)| = |P_c(n,k)| \cdot k$ because every equivalence class in
$P_c(n,k)$ contains k permutations from P(n,k) (since there are k ways
to rotate a k-permutation).


From this we get $\frac{n!}{(n-k)!} = |P_c(n,k)| \cdot k$ which implies the claim.
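Both counts can be verified by enumeration for small n and k. The sketch below is an illustration only: the helper names are hypothetical, k-permutations are represented as tuples (π(1), ..., π(k)), and each circular class is represented by its lexicographically smallest rotation.

```python
from itertools import permutations
from math import factorial


def k_permutations(n, k):
    """All k-permutations of [n], as tuples (pi(1), ..., pi(k))."""
    return list(permutations(range(1, n + 1), k))


def canonical_rotation(p):
    """Smallest rotation of p, used as representative of its circular class."""
    return min(p[s:] + p[:s] for s in range(len(p)))


def circular_k_permutations(n, k):
    """One representative per circular k-permutation of [n]."""
    return {canonical_rotation(p) for p in k_permutations(n, k)}


n, k = 5, 3
print(len(k_permutations(n, k)), factorial(n) // factorial(n - k))
print(len(circular_k_permutations(n, k)), factorial(n) // (k * factorial(n - k)))

# 76123 and 12376 are circular equivalents, 32167 is not equivalent to them.
print(canonical_rotation((7, 6, 1, 2, 3)) == canonical_rotation((1, 2, 3, 7, 6)))
print(canonical_rotation((7, 6, 1, 2, 3)) == canonical_rotation((3, 2, 1, 6, 7)))
```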


1.3. Unordered Arrangements — 
Combinations, Subsets and Multisets 


Let X be a finite set and k ∈ ℕ. An unordered arrangement of k elements of X
is a multiset S of size k with elements from X.

Take for example X = {□, ○, |, △, ▽}, then an unordered arrangement of 7
elements could be S = {□, □, ○, △, △, △, ▽}. Order in sets and multisets does not
matter, so we could write the same multiset as S = {△, ○, △, ▽, □, △, □}.

However, we prefer the following notation, where every type x ∈ X occurring
in S is given only once and accompanied by its repetition number $r_x$ that
captures how often the type occurs in the multiset. The example above is written
as: S = {2·□, 1·○, 3·△, 1·▽} and we would say the type □ has repetition
number 2, the type ○ has repetition number 1 and so on. We write □ ∈ S,
$r_□ = 2$, ○ ∈ S, $r_○ = 1$ and so on.


The difference between ordered and unordered arrangements is that ordered 
arrangements are selections of elements of X that are done one at a time, while 
unordered arrangements are selections of objects done all at the same time. 

A typical example for unordered arrangements would be shopping lists: You 
may need three bananas and two pears, but this is the same as needing a pear, 
three bananas and another pear. In a sense, you need everything at once with 
no temporal ordering. This is in contrast to something like a telephone number 
(which is an ordered arrangement) where dialing a 5 first and a 9 later is differ- 
ent from dialing 9 first and 5 later. 


As with ordered arrangements, the most important case for unordered
arrangements is that all repetition numbers are 1, i.e. $r_x = 1$ for all x ∈ S. Then
S is simply a subset of X, denoted by S ⊆ X.


Definition 1.3. k-Combination: Let X be a finite set. A k-combination of
X is an unordered arrangement of k distinct elements from X. We prefer
the more standard term subset and use “combination” only when we want
to emphasize the selection process. The set of all k-subsets of X is denoted
by $\binom{X}{k}$ and if |X| = n then we denote


$$\binom{n}{k} := \left| \binom{X}{k} \right|.$$


k-Combination of a Multiset: Let X be a finite set of types and let M be
a finite multiset with types in X and repetition numbers $r_1, \ldots, r_{|X|}$. A
k-combination of M is a multiset with types in X and repetition numbers
$s_1, \ldots, s_{|X|}$ such that $s_i \le r_i$, $1 \le i \le |X|$, and $\sum_{i=1}^{|X|} s_i = k$.
If for example M = {2·□, 1·○, 3·△, 1·▽}, then T = {1·□, 2·△} is a
3-combination of the multiset M, but T' = {3·□, 0·▽} is not.

k-Permutation of a Multiset: Let M be a finite multiset with set of types X.
A k-permutation of M is an ordered arrangement of k elements of M where
different orderings of elements of the same type are not distinguished.
This is an ordered multiset with types in X and repetition numbers
$s_1, \ldots, s_{|X|}$ such that $s_i \le r_i$, $1 \le i \le |X|$, and $\sum_{i=1}^{|X|} s_i = k$.


Note that there might be several elements of the same type compared
to a permutation of a set (where each repetition number equals 1). If for
example M = {2·□, 1·○, 3·△, 1·▽}, then T = (□, △, △, ○) is a 4-permutation
of the multiset M.


Theorem 1.4. For $0 \le k \le n$, we have


$$|P(n,k)| = \binom{n}{k} \cdot k! \quad\text{and therefore}\quad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$


Proof. To build an element from P(n,k), we first choose k elements from [n];
by our definition of $\binom{n}{k}$, there are exactly $\binom{n}{k}$ ways to do so. Then choose an
order of the k elements; there are $|P(k,k)| = \frac{k!}{0!} = k!$ ways to do so.

Every element from P(n,k) can be constructed like this in exactly one way
so $|P(n,k)| = \binom{n}{k} \cdot k!$ which proves the claim.

Using Theorem 1.2(i) now gives the identity on the right as well.


The numbers $\binom{n}{k}$ for $0 \le k \le n$ are called binomial coefficients because these
are the coefficients when expanding the n-th power of a sum of two variables:


$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$
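The relation between k-permutations and k-subsets, and the binomial expansion, can be tested directly for small values. This is a minimal sketch with hypothetical helper names; it uses `math.comb` from the Python standard library for $\binom{n}{k}$ and integer arguments for x and y so that all comparisons are exact.

```python
from itertools import combinations, permutations
from math import comb, factorial


def count_subsets(n, k):
    """Number of k-subsets of [n], by enumeration."""
    return sum(1 for _ in combinations(range(1, n + 1), k))


def count_k_permutations(n, k):
    """Number of k-permutations of [n], by enumeration."""
    return sum(1 for _ in permutations(range(1, n + 1), k))


def binomial_expansion(x, y, n):
    """Right-hand side of the binomial formula for integers x and y."""
    return sum(comb(n, k) * x**k * y**(n - k) for k in range(n + 1))


n, k = 6, 3
print(count_k_permutations(n, k), count_subsets(n, k) * factorial(k))
print(count_subsets(n, k), factorial(n) // (factorial(k) * factorial(n - k)))
print(binomial_expansion(2, 5, 7), (2 + 5) ** 7)
```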


1.4 Multinomial Coefficients 
The binomial formula can be generalized for more than two variables. 


Definition 1.5. For non-negative integers $k_1, \ldots, k_r$ with $k_1 + \ldots + k_r = n$ the
multinomial coefficients $\binom{n}{k_1, \ldots, k_r}$ are the coefficients when expanding the n-th
power of a sum of r variables. In other words, define them to be the unique
numbers satisfying the following identity:

$$(x_1 + \ldots + x_r)^n = \sum_{k_1 + \ldots + k_r = n} \binom{n}{k_1, \ldots, k_r} x_1^{k_1} x_2^{k_2} \cdots x_r^{k_r}.$$

Example. Verify by multiplying out:


$$\begin{aligned}
(x_1 + x_2 + x_3)^4 ={}& 1 \cdot (x_1^4 + x_2^4 + x_3^4) \\
&+ 4 \cdot (x_1^3 x_2 + x_1^3 x_3 + x_1 x_2^3 + x_2^3 x_3 + x_1 x_3^3 + x_2 x_3^3) \\
&+ 6 \cdot (x_1^2 x_2^2 + x_1^2 x_3^2 + x_2^2 x_3^2) \\
&+ 12 \cdot (x_1^2 x_2 x_3 + x_1 x_2^2 x_3 + x_1 x_2 x_3^2).
\end{aligned}$$


The coefficient in front of $x_1 x_2 x_3^2$ is 12, the coefficient in front of $x_1^2 x_3^2$ is 6 and
the coefficient in front of $x_3^4$ is 1.
According to the last definition this means:


$$\binom{4}{1,1,2} = 12, \qquad \binom{4}{2,0,2} = 6, \qquad \binom{4}{0,0,4} = 1.$$


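Theorem 1.6 (Multinomial Theorem).
For non-negative integers $k_1, \ldots, k_r$ with $k_1 + \ldots + k_r = n$ we have:


$$\binom{n}{k_1, \ldots, k_r} = \binom{n}{k_1} \binom{n - k_1}{k_2} \cdots \binom{n - k_1 - \ldots - k_{r-1}}{k_r} = \frac{n!}{k_1! \, k_2! \cdots k_r!}.$$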


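Proof. The naive way to multiply out $(x_1 + \ldots + x_r)^n$ would be:


$$(x_1 + \ldots + x_r)^n = \underbrace{\sum_{i_1=1}^{r} \sum_{i_2=1}^{r} \cdots \sum_{i_n=1}^{r}}_{n \text{ summation signs}} x_{i_1} x_{i_2} \cdots x_{i_n}$$


The monomial $x_{i_1} x_{i_2} \cdots x_{i_n}$ is equal to $x_1^{k_1} \cdots x_r^{k_r}$ if from the indices $\{i_1, \ldots, i_n\}$
exactly $k_1$ are equal to 1, $k_2$ equal to 2 and so on.
We count the number of assignments of values to the indices $\{i_1, \ldots, i_n\}$
satisfying this.


Choose $k_1$ indices to be equal to 1: There are $\binom{n}{k_1}$ ways to do so.


Choose $k_2$ indices to be equal to 2: There are $n - k_1$ indices left to choose
from, so there are $\binom{n - k_1}{k_2}$ ways to choose $k_2$ of them.


Choose $k_j$ indices to be equal to j (j ∈ [r]): There are $n - k_1 - \ldots - k_{j-1}$
indices left to choose from, so there are $\binom{n - k_1 - \ldots - k_{j-1}}{k_j}$ ways to choose $k_j$
of them.

Hence the multinomial coefficient (i.e. the coefficient of $x_1^{k_1} x_2^{k_2} \cdots x_r^{k_r}$) is


$$\binom{n}{k_1, \ldots, k_r} = \binom{n}{k_1} \binom{n - k_1}{k_2} \cdots \binom{n - k_1 - \ldots - k_{r-1}}{k_r}$$


and the first identity is proved. Now use Theorem 1.4 to rewrite the binomial
coefficients and obtain

$$\frac{n!}{k_1! \, (n - k_1)!} \cdot \frac{(n - k_1)!}{k_2! \, (n - k_1 - k_2)!} \cdot \ldots \cdot \frac{(n - k_1 - \ldots - k_{r-1})!}{k_r! \, (n - k_1 - \ldots - k_r)!}.$$


Conveniently, many of the factorial terms cancel out: each numerator
$(n - k_1 - \ldots - k_j)!$ cancels against the same term in the preceding denominator, leaving


$$\frac{n!}{k_1! \, k_2! \cdots k_r! \, (n - k_1 - \ldots - k_r)!}.$$


Also, $(n - k_1 - \ldots - k_r)! = (n - n)! = 0! = 1$ so we end up with:


$$\frac{n!}{k_1! \cdot k_2! \cdot \ldots \cdot k_r!}. \qquad \square$$


Example. $\binom{5}{3,0,2} = \frac{5!}{3! \cdot 0! \cdot 2!} = \frac{120}{6 \cdot 1 \cdot 2} = 10$.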
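The two expressions for the multinomial coefficient, and its meaning as a coefficient of the expanded power, can be compared numerically. The following is a minimal sketch with hypothetical function names. It follows the counting idea of the proof: `expansion_coefficients` runs through all index assignments $(i_1, \ldots, i_n)$ and records which monomial each one produces.

```python
from collections import Counter
from itertools import product
from math import comb, factorial


def multinomial_by_binomials(ks):
    """Product of binomial coefficients, choosing positions type by type."""
    remaining = sum(ks)
    result = 1
    for k in ks:
        result *= comb(remaining, k)
        remaining -= k
    return result


def multinomial_by_factorials(ks):
    """n! / (k_1! * ... * k_r!) with n = k_1 + ... + k_r."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)
    return result


def expansion_coefficients(r, n):
    """Map each exponent vector to the number of index assignments producing it."""
    coefficients = Counter()
    for indices in product(range(r), repeat=n):
        exponents = tuple(indices.count(j) for j in range(r))
        coefficients[exponents] += 1
    return coefficients


print(multinomial_by_binomials((3, 0, 2)), multinomial_by_factorials((3, 0, 2)))
coefficients = expansion_coefficients(3, 4)
print(coefficients[(1, 1, 2)], coefficients[(2, 0, 2)], coefficients[(0, 0, 4)])
```

The brute-force expansion visits $r^n$ index assignments, so it is only practical for small r and n.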




Note that the last Theorem establishes that binomial coefficients are special
cases of multinomial coefficients. We have for $0 \le k \le n$:


$$\binom{n}{k} = \frac{n!}{k! \, (n-k)!} = \binom{n}{k, \, n-k}.$$


We extend the definition of binomial and multinomial coefficients by setting
$\binom{n}{k_1, \ldots, k_r} = 0$ if $k_i = -1$ for some i, and $\binom{n}{-1} = \binom{n}{n+1} = 0$. This makes stating
the following lemma more convenient.


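Lemma 1.7 (Pascal’s Formula).
If $n \ge 1$ and $0 \le k \le n$, we have

$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}.$$

More generally, for $n \ge 1$ and $k_1, \ldots, k_r \ge 0$ with $k_1 + \ldots + k_r = n$, we have

$$\binom{n}{k_1, \ldots, k_r} = \sum_{i=1}^{r} \binom{n-1}{k_1, \ldots, k_i - 1, \ldots, k_r}.$$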

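Proof. Note first that in the case of binomial coefficients, the claim can be
rewritten as:

$$\binom{n}{k, \, n-k} = \binom{n-1}{k, \, n-k-1} + \binom{n-1}{k-1, \, n-k}$$

so it really is a special case of the second claim, which we prove now. By
definition of multinomial coefficients we have the identity:


$$(x_1 + \ldots + x_r)^n = \sum_{k_1 + \ldots + k_r = n} \binom{n}{k_1, \ldots, k_r} x_1^{k_1} \cdots x_r^{k_r}$$


Exploring a different way to expand $(x_1 + \ldots + x_r)^n$, we obtain:


$$(x_1 + \ldots + x_r)^n = (x_1 + \ldots + x_r) \cdot (x_1 + \ldots + x_r)^{n-1} = \sum_{i=1}^{r} x_i \sum_{k'_1 + \ldots + k'_r = n-1} \binom{n-1}{k'_1, \ldots, k'_r} x_1^{k'_1} \cdots x_r^{k'_r}$$


By substituting indices as $k_i := k'_i + 1$ and $k_j := k'_j$ (for $j \ne i$) this is equal to:


$$\sum_{i=1}^{r} \sum_{\substack{k_1 + \ldots + k_r = n \\ k_i - 1 \ge 0}} \binom{n-1}{k_1, \ldots, k_i - 1, \ldots, k_r} x_1^{k_1} \cdots x_r^{k_r}$$


Note that the condition $k_i - 1 \ge 0$ is not needed, because for the summands
with $k_i - 1 = -1$ we defined the coefficient under the sum to be 0. We remove
it and swap the summation signs, ending up with:


$$\sum_{\substack{k_1, \ldots, k_r \ge 0 \\ k_1 + \ldots + k_r = n}} \left( \sum_{i=1}^{r} \binom{n-1}{k_1, \ldots, k_i - 1, \ldots, k_r} \right) x_1^{k_1} \cdots x_r^{k_r}$$


Now we have two ways to write the coefficients of $(x_1 + \ldots + x_r)^n$ and
therefore the identity:


$$\binom{n}{k_1, \ldots, k_r} = \sum_{i=1}^{r} \binom{n-1}{k_1, \ldots, k_i - 1, \ldots, k_r}$$


as claimed.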
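Pascal's formula for multinomial coefficients can be checked exhaustively for small parameters. The sketch below is an illustration with hypothetical function names; it uses the factorial formula for the coefficients and the convention that a coefficient with a negative entry is 0.

```python
from itertools import product
from math import factorial


def multinomial(n, ks):
    """n! / (k_1! ... k_r!), extended by 0 if some k_i is negative."""
    if any(k < 0 for k in ks):
        return 0
    result = factorial(n)
    for k in ks:
        result //= factorial(k)
    return result


def pascal_sum(n, ks):
    """Right-hand side of Pascal's formula: lower one k_i at a time."""
    return sum(
        multinomial(n - 1, ks[:i] + (ks[i] - 1,) + ks[i + 1:])
        for i in range(len(ks))
    )


def compositions(n, r):
    """All tuples (k_1, ..., k_r) of non-negative integers summing to n."""
    return [ks for ks in product(range(n + 1), repeat=r) if sum(ks) == n]


n, r = 5, 3
print(all(multinomial(n, ks) == pascal_sum(n, ks) for ks in compositions(n, r)))
```

With r = 2 the same check covers the binomial case of the lemma.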


You may already know Pascal’s Triangle. It is a way to arrange binomial
coefficients in the plane as shown in Figure 9. It is possible to do something
similar in the case of multinomial coefficients, however, when drawing all
coefficients of the form $\binom{n}{k_1, \ldots, k_r}$ the drawing will be r-dimensional. For r = 3, a
“Pascal Pyramid” is given in Figure 10.

Now that we know what multinomial coefficients are and how to compute 
them, it is time to see how they can help us count things. 


Example. How many 6-permutations are there of the multiset {○, △, △, □, □, □}?
Trying to list them all ((○,△,△,□,□,□), (△,○,△,□,□,□), (△,△,○,□,□,□), ...)
would be tedious. Well, there are 6 possible positions for ○. Then there are 5
positions left, 2 of which need to contain △. The three □ need to go to the three
remaining positions, so there is just one choice left for them. This means there
are $6 \cdot \binom{5}{2} \cdot 1 = 60$ arrangements in total. This is equal to $\binom{6}{1} \binom{5}{2} \binom{3}{3} = \binom{6}{1,2,3}$,
which is no coincidence:


Theorem 1.8. Let S be a finite multiset with k different types and repetition
numbers $r_1, r_2, \ldots, r_k$. Let the size of S be $n = r_1 + r_2 + \cdots + r_k$. Then the
number of n-permutations of S equals


$$\binom{n}{r_1, \ldots, r_k}.$$

Proof. Label the k types as $a_1, \ldots, a_k$. In an n-permutation there are n positions
that need to be assigned a type. First choose the $r_1$ positions for the first type;
there are $\binom{n}{r_1}$ ways to do so. Then assign $r_2$ positions for the second type, out
of the $n - r_1$ positions that are still left to choose. That amounts to $\binom{n - r_1}{r_2}$
choices. Continuing like this, the total number of choices will be:

$$\binom{n}{r_1} \cdot \binom{n - r_1}{r_2} \cdot \ldots \cdot \binom{n - r_1 - r_2 - \ldots - r_{k-1}}{r_k} \overset{\text{Thm 1.6}}{=} \binom{n}{r_1, \ldots, r_k}. \qquad \square$$
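The count of n-permutations of a multiset can be confirmed by enumeration. The sketch below is an illustration only: the function names are hypothetical and the types are spelled out as strings. It generates all orderings of the elements, merges orderings that differ only by swapping elements of the same type, and compares the result with $n! / (r_1! \cdots r_k!)$.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def count_by_enumeration(multiset):
    """Number of distinct orderings of all elements of the multiset."""
    return len(set(permutations(multiset)))


def count_by_formula(multiset):
    """n! divided by the factorials of the repetition numbers."""
    result = factorial(len(multiset))
    for repetition in Counter(multiset).values():
        result //= factorial(repetition)
    return result


example = ["circle", "triangle", "triangle", "square", "square", "square"]
print(count_by_enumeration(example), count_by_formula(example))
```

The enumeration touches all n! orderings, so it only serves as a check for small multisets.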


Figure 9: Arrangement of the binomial coefficients $\binom{n}{k}$. Lemma 1.7 shows that
the number $\binom{n}{k}$ is obtained as the sum of the two numbers $\binom{n-1}{k-1}$ and $\binom{n-1}{k}$
directly above it.

  n = 0:          1
  n = 1:        1   1
  n = 2:      1   2   1
  n = 3:    1   3   3   1
  n = 4:  1   4   6   4   1


Figure 10: Arrangement of the numbers $\binom{n}{k_1, k_2, k_3}$ with $k_1 + k_2 + k_3 = n$ in a
pyramid. To make the picture less messy, the numbers in the back (with large
$k_3$) are faded out. The three flanks of the pyramid are Pascal triangles, one of
which is the black triangle in the front with numbers $\binom{n}{k_1, k_2, 0}$. The first number
not on one of the flanks is $\binom{3}{1,1,1} = 6$.


We have seen before how the coefficient $\binom{n}{k} = \binom{n}{k, \, n-k}$ counts the number
of k-combinations of [n] (i.e. the number of k-element subsets of [n]). Now we have learned that
it also counts the number of n-permutations of a multiset with two types and
repetition numbers k and n − k. How come ordered and unordered things are
counted by the same number? We demystify this by finding a natural bijection:

Consider the multiset M := {k-V,(n—k)-X} with types Vv and X (the 
“chosen” type and the “unchosen” type). Now associate with an n-permutation 
of M the set of positions that contain V, for instance with n = 5, k = 2 and 
the permutation (V,X,X,v”,X), the corresponding set would be {1,4} C [5] 
(since the first and fourth position received Vv). It is easy to see that every 
n-permutation of M corresponds to a unique k element subset of [n] and vice 
versa. 

Note that so far we only considered n-permutations of multisets of size n.
What about general r-permutations of multisets of size n? For example, the
number of 2-permutations of the multiset {□, △, △, △, ○} is 7, since there are:

    □△, □○, △□, △△, △○, ○□, ○△.

Note that □□ and ○○ are not possible, since we have only one copy of □
and ○ at our disposal. The weird number 7 already suggests that general
r-permutations of n-element multisets may not be as easy to count. Indeed, there
is no simple formula as in Theorem 1.8, but we will see an answer using the
principle of inclusion and exclusion later.

There is a special case other than r = n that we can handle, though: If
all repetition numbers $r_i$ of a multiset with k types are at least
r, for instance when considering 2-permutations of M := {□,□,△,△,△,○,○},
then those repetition numbers do not actually impose a restriction, since we
will never run out of copies of any type. We would sometimes sloppily write
M = {∞·□, ∞·△, ∞·○} where the infinity sign indicates that there are “many”
copies of the corresponding elements. The number of r-permutations of M is
then equal to $k^r$: just choose one of the k types for each of the r positions.
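Both observations about r-permutations can be checked by enumeration. The sketch below is only an illustration; the function name `count_r_permutations` and the letters used for the types are assumptions of ours.

```python
from itertools import permutations, product

def count_r_permutations(multiset, r):
    """Number of distinct ordered selections of r elements from a multiset."""
    return len(set(permutations(multiset, r)))

# Repetition numbers 1, 3, 1: two of the nine conceivable pairs are impossible.
few_copies = count_r_permutations("SAAAO", 2)

# Every type has at least r = 2 copies, so all k^r strings over the types occur.
k, r = 3, 2
many_copies = count_r_permutations("SSAAAOO", r)
all_strings = len(list(product("SAO", repeat=r)))
print(few_copies, many_copies, all_strings, k ** r)
```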

After permutations of multisets, we now consider combinations. 


Example. Say you are told to bring two pieces of fruit from the supermarket
and they have 🍎, 🍌 and 🍐 (large quantities of each). How many choices do you
have? Well, there are: {🍎,🍎}, {🍎,🍌}, {🍎,🍐}, {🍌,🍌}, {🍌,🍐}, {🍐,🍐}, so six
combinations. Note that bringing a 🍌 and an 🍎 is the same as bringing an 🍎
and a 🍌 (your selection is not ordered), so this option is counted only once.


We now determine the number of combinations for arbitrary number of types 
and number of elements to choose. 


Theorem 1.9. Let r, k ∈ ℕ and let S be a multiset with k types and large
repetition numbers (each of $r_1,\dots,r_k$ is at least r). Then the number of
r-combinations of S equals

$$\binom{k+r-1}{r}.$$


Proof. For clarity, we do the proof alongside an example. Let the types be
$a_1, a_2, \dots, a_k$, for instance k = 4 and $a_1$ = 🍎, $a_2$ = 🍌, $a_3$ = 🍐, $a_4$ = 🍒. Then
imagine the r-combinations laid out linearly, first all elements of type $a_1$, then
all of type $a_2$ and so on. In our example this could be

    🍎 🍎 🍐 🍐 🍒    for the combination {2·🍎, 2·🍐, 1·🍒}.

Now for each i ∈ [k − 1], draw a delimiter between types $a_i$ and $a_{i+1}$, in our
example:

    🍎 🍎 | | 🍐 🍐 | 🍒

Note that, since we have no elements of type $a_2$ = 🍌, there are two delimiters
directly after one another. Given these delimiters, drawing the elements is
actually redundant; just replacing every element with “•” yields:

    • • | | • • | •

The original r-combination can be reconstructed from this sequence of | and •:
Just replace every • with $a_1$ until the first occurrence of |, then use $a_2$ until the
next |, then $a_3$ and so on. So every (r+k−1)-permutation of T = {(k−1)·|, r·•}
corresponds to an r-combination of S and vice versa.

We know how to count the former: by Theorem 1.8 the number of
(r+k−1)-permutations of T is $\binom{r+k-1}{k-1,\,r} = \binom{r+k-1}{r}$.
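The encoding used in this proof is easy to try out. The following sketch (function names are ours; types are numbered 0 to k−1 and `*` plays the role of the dot) counts r-combinations by brute force and prints the delimiter string of one combination.

```python
from itertools import combinations_with_replacement
from math import comb

def count_r_combinations(k, r):
    """Brute-force number of r-combinations of k types with unlimited copies."""
    return sum(1 for _ in combinations_with_replacement(range(k), r))

def to_bars_and_dots(combination, k):
    """Encode an r-combination of types 0..k-1 as a string of '|' and '*'."""
    return "|".join("*" * combination.count(t) for t in range(k))

two_of_three = count_r_combinations(3, 2)          # two items, three types
encoded = to_bars_and_dots((0, 0, 2, 2, 3), 4)     # no element of type 1
print(two_of_three, comb(3 + 2 - 1, 2), encoded)
```

The encoded string always has k − 1 bars and r stars, which is exactly the multiset T of the proof.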


Counting r-combinations of multisets where repetition numbers $r_1,\dots,r_k$
may be smaller than r is more difficult. We will see later how the inclusion-
exclusion principle provides an answer in this case.


1.5 The Twelvefold Way — Balls in Boxes 


The most classic combinatorial problem concerns counting arrangements of n
balls in k boxes. There are four cases: U → L, L → U, L → L, U → U,
for arrangements of Labeled or Unlabeled balls in Labeled or Unlabeled
boxes. Here “labeled” means distinguishable and “unlabeled” means
indistinguishable.

These four cases become twelve cases if we consider the following sub- 
variants: 


(i) No box may contain more than one ball. Example: When assigning stu- 
dents to topics in a seminar, there may be at most one student per topic. 


(ii) Each box must contain at least one ball. Example: When you want to get 
10 people and 5 cars to Berlin, you have some flexibility in distributing 
the people to cars, but a car cannot drive on its own. 


(iii) No restriction. 


A summary of the results of this section is given at the end in Table 1.

Instead of counting the ways balls can be arranged in boxes, some people
count the ways that balls can be put into the boxes or picked from the
boxes, but this does not actually make a difference. We now systematically
examine all 12 cases.


1.5.1 U → L: n Unlabeled Balls in k Labeled Boxes


Example 1.10. Torsten decided there should always be n = 30 points to a
worksheet and k = 5 problems per worksheet. Stefan believes some problems are
easier than others and deserve more points. He wonders how many ways there
are to distribute the points to the problems. In this setting, balls correspond
to points (points are not labeled, they are just points) and boxes correspond to
problems (they are labeled: There is problem 1, 2, 3, 4 and 5).
In other words, we search for solutions to the equation

$$30 = x_1 + x_2 + x_3 + x_4 + x_5$$

with non-negative integers $x_1,\dots,x_5$, for example:

    30 = 8 + 4 + 5 + 6 + 7   or   30 = 10 + 5 + 4 + 8 + 3.

The students say that they would like it if some of the problems were “bonus
problems” worth zero points, so the partition 30 = 10 + 10 + 10 + 0 + 0 should
be permissible. Torsten remains unconvinced.


We come back to this later and examine three cases for general n and k in 
the balls-and-boxes formulation: 


≤ 1 ball per box Of course, this is only possible if there are at most as many
balls as boxes (n ≤ k). For n = 2 and k = 5 one arrangement would be:

    [ ] [●] [ ] [ ] [●]

Each of the k boxes can have two states: occupied (one ball in it) or
empty (no ball in it), and exactly n boxes are occupied. The number of
ways to choose these occupied boxes is $\binom{k}{n}$.

≥ 1 ball per box Of course, this is only possible if there are at least as many
balls as boxes (n ≥ k). For example for n = 9 and k = 5:

    [●●] [●] [●●●] [●] [●●]

To count the number of ways to do this, arrange the balls linearly, like
this:

    ● ● ● ● ● ● ● ● ●

and choose k − 1 out of the n − 1 gaps between the balls; each chosen gap
marks the beginning of a new box. In the example above this would look like
this:

    ● ● | ● | ● ● ● | ● | ● ●

There is a bijection between the arrangements with no empty box and the
choices of k − 1 gaps for delimiters out of n − 1 gaps in total. We know
how to count the latter: There are $\binom{n-1}{k-1}$ possibilities.

Going back to the example, we now know there are $\binom{30-1}{5-1} = \binom{29}{4}$ solutions to the
equation

$$x_1 + x_2 + x_3 + x_4 + x_5 = 30$$

where $x_1,\dots,x_5$ are positive integers.
arbitrary number of balls per box Now boxes are allowed to contain any
number of balls, including 0. One example for n = 7 and k = 5 would be:

    [●●] [ ] [●●] [●] [●●]

We count the number of such arrangements in three different ways:

1. From an arrangement we obtain a string by “reading” it from left to
   right, writing ○ when we see a ball and | when a new box starts.
   For the example above this yields the string

       ○ ○ | | ○ ○ | ○ | ○ ○

   We see: The permutations of the multiset {n·○, (k−1)·|} correspond
   directly to the arrangements of the balls. Note the difference
   to the case of non-empty boxes we discussed before.

   We know how to count the permutations of the multiset: by Theorem
   1.8 there are $\binom{n+k-1}{k-1}$ of them.


2. There is a bijection between:

   (i) Arrangements of n balls in k boxes.
   (ii) Arrangements of n + k balls in k boxes with no empty box.

   For a map from (i) to (ii), just add one ball to each box. For the
   inverse map from (ii) to (i), just remove one ball from each box.
   We already know how to count (ii): there are $\binom{n+k-1}{k-1}$ such
   arrangements.


3. Assume n, k ≥ 1. To count the arrangements, first choose the number
   i of boxes that should be empty (0 ≤ i ≤ k − 1), then choose which
   boxes these should be (there are $\binom{k}{i}$ choices) and then distribute all n
   balls to the remaining k − i boxes such that none of those boxes
   is empty (we already know how to count this). This yields:

   $$\sum_{i=0}^{k-1} \binom{k}{i}\binom{n-1}{k-i-1}.$$

   This looks different from the other two results, which means we
   “accidentally” proved the non-trivial identity:

   $$\binom{n+k-1}{k-1} = \sum_{i=0}^{k-1} \binom{k}{i}\binom{n-1}{k-i-1} \qquad (n,k \ge 1).$$
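The counts for unlabeled balls in labeled boxes can be cross-checked on a small instance. This is a sketch under our own naming (`count_solutions`); it enumerates integer solutions of $x_1 + \dots + x_k = n$ directly and compares them with the formulas, including the identity from the third counting argument.

```python
from itertools import product
from math import comb

def count_solutions(n, k, minimum):
    """Brute force: tuples (x_1, ..., x_k) of integers >= minimum that sum to n."""
    return sum(1 for xs in product(range(minimum, n + 1), repeat=k) if sum(xs) == n)

n, k = 7, 3  # small enough to enumerate
no_empty_box = count_solutions(n, k, 1)
any_filling = count_solutions(n, k, 0)
by_empty_boxes = sum(comb(k, i) * comb(n - 1, k - i - 1) for i in range(k))
print(no_empty_box, comb(n - 1, k - 1))
print(any_filling, comb(n + k - 1, k - 1), by_empty_boxes)

# The same formulas for 30 unlabeled balls in 5 labeled boxes.
print(comb(30 - 1, 5 - 1), comb(30 + 5 - 1, 5 - 1))
```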


Going back to the example, we now know there are $\binom{30+5-1}{5-1} = \binom{34}{4}$ solutions to the
equation

$$30 = x_1 + x_2 + x_3 + x_4 + x_5$$

with non-negative integers, i.e. that many assignments of points to the five
exercises on exercise sheets such that the total number of points is 30.
Torsten thinks that no exercise should be worth more than 10 points. In
the balls and boxes setting this limits the capacity $r_i$ of a box i. This makes
counting much more difficult, but we will see a way to address this in Chapter 2
using the principle of inclusion and exclusion.

(As for homework problems, it turns out that Jonathan put \def\points{6}
into his LaTeX preamble, which settles the issue.)


1.5.2 L → U: n Labeled Balls in k Unlabeled Boxes


Example 1.11. There are n = 25 kids on a soccer field that want to form k = 4 
teams. The kids are on different skill levels (e.g. a team of five bad players is 
quite a different thing than a team of five good players) but the teams are 
unlabeled (it wouldn’t make a difference if you swap all players of two teams). 
In how many ways can the n kids form k teams? In other words: In how many 
ways can the set of kids be partitioned into k parts? 


Here, kids correspond to balls and teams correspond to boxes. Again, we con- 
sider three subcases for general k and n. 


≤ 1 ball per box Of course, this is only possible if there are at most as many
balls as boxes (n ≤ k). Each ball will be in its own box. If the balls are
labeled with numbers from [n], there will be one box with the ball with
label 1, one box with the ball with label 2 and so on. In other words, there
is only 1 possibility; for n = 3 and k = 5 the only arrangement looks like
this:

    [1] [2] [3] [ ] [ ]

≥ 1 ball per box Of course, this is only possible if there are at least as many
balls as boxes (n ≥ k).


This is the same as the number of partitions of [n] into k non-empty parts,
which we also call Stirling Numbers of the second kind and write as $s_k^{II}(n)$.

Some values are easy to determine:

• $s_0^{II}(0) = 1$: There is one way to partition the empty set into
  non-empty parts: $\emptyset = \bigcup_{X \in \emptyset} X$. Each $X \in \emptyset$ is non-empty (because no
  such X exists).

• $s_0^{II}(n) = 0$ (for n ≥ 1): There is no way to partition non-empty sets
  into zero parts.

• $s_1^{II}(n) = 1$ (for n ≥ 1): Every non-empty set X can be partitioned
  into one non-empty set in exactly one way: X = X.

• $s_2^{II}(n) = \frac{2^n - 2}{2} = 2^{n-1} - 1$ (for n ≥ 1): We want to partition [n] into
  two non-empty parts. If we consider the parts labeled (there is a first
  part and a second part), then choosing the first part fully determines
  the second and vice versa. Every subset of [n] is allowed to be the
  first part, except for ∅ and [n]. This amounts to $2^n - 2$ possibilities;
  however, since the parts are actually unlabeled (there is no “first”
  or “second”), every possibility is counted twice, so we need to divide
  by 2.

• $s_n^{II}(n) = 1$: There is only one way to partition [n] into n parts: Every
  number gets its own part.


A recursion formula for $s_k^{II}(n)$ is given as:

$$s_k^{II}(n) = k \cdot s_k^{II}(n-1) + s_{k-1}^{II}(n-1).$$

To see this, count the arrangements of the n labeled balls in unlabeled
boxes with no empty box as follows:

• The ball of label n may have its own box (with no other ball in
  it). The number of such arrangements is equal to the number of
  arrangements of the remaining n − 1 balls in k − 1 boxes such that
  none of those k − 1 boxes is empty. There are $s_{k-1}^{II}(n-1)$ of those.

• The box with the ball of label n contains another ball. Then, when
  removing ball n, there is still at least one ball per box. So removing
  ball n gets us to an arrangement of n − 1 balls in k non-empty boxes.
  There are $s_k^{II}(n-1)$ of those and for each there are k possibilities
  where ball n could have been before removal (note that the boxes
  are distinguished by the balls that are already in them). So there are
  $k \cdot s_k^{II}(n-1)$ arrangements where ball n is not alone in a box.

The recursion formula follows from summing up the values for the two
cases.
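The recursion together with the easy base values is enough to compute any Stirling number of the second kind. A minimal memoised sketch (the function name `stirling2` is our own) could look like this:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of partitions of an n-element set into k non-empty parts."""
    if n == 0 and k == 0:
        return 1          # the empty set has exactly one partition into zero parts
    if n == 0 or k == 0:
        return 0          # parts must be non-empty, and a non-empty set needs parts
    # ball n joins one of the k existing boxes, or sits alone in its own box
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

row_six = [stirling2(6, k) for k in range(7)]
print(row_six)
```

The printed list is the row n = 6 of the triangle that the recursion generates.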


Note that the Stirling Numbers fulfill a recursion similar to the recursion of
binomial coefficients, so there is something similar to the Pascal Triangle.
See Figure 11.

    n=6 →  0   1   31   90   65   15   1

Figure 11: “Stirling Triangle” (only the row n = 6 is reproduced here). A number
$s_k^{II}(n)$ is obtained as the sum of the number $s_{k-1}^{II}(n-1)$ toward the top left
and k times the number $s_k^{II}(n-1)$ towards the top right. E.g. for n = 6, k = 3:
90 = 15 + 3 · 25.

We will later see a closed form of $s_k^{II}(n)$ in Chapter 2 using the principle
of inclusion and exclusion.


arbitrary number of balls per box Empty boxes are allowed. To count all
arrangements, first choose the number i of boxes that should be non-empty
(0 ≤ i ≤ k), then count the number of arrangements of n balls into the i
boxes such that none of them is empty. This gives a total of:

$$\sum_{i=0}^{k} s_i^{II}(n).$$


1.5.3 L → L: n Labeled Balls in k Labeled Boxes


Example 1.12. At the end of the semester, each of the n = 70 students of
combinatorics will be assigned one of the k = 11 grades from {1.0, 1.3, ..., 5.0}.
Both Torsten and the students think that the outcome

    [figure: one arrangement of the balls in the grade boxes]

would be preferable to the outcome

    [figure: another arrangement of the balls in the grade boxes]

so the boxes (grades) are clearly labeled (we could not draw the boxes 2.0 till
4.0 due to lack of space). Furthermore, Alice (A) and Bob (B) insist that they
would notice the difference between the arrangement

    [figure: an arrangement showing the balls A and B]

and the arrangement

    [figure: another arrangement showing the balls A and B]

so the balls (students) are labeled as well. Such arrangements correspond
directly to functions f : [n] → [k], mapping each of the n balls to one of the k
boxes, or here: mapping every student to their grade.


As before, we consider three subcases: 


≤ 1 ball per box Of course, this is only possible if there are at most as many
balls as boxes (n ≤ k).

Such arrangements correspond to injective functions f : [n] → [k]. We
first choose the image of 1 ∈ [n] (there are k possibilities), then the image
of 2 ∈ [n] (there are k − 1 possibilities left) and so on. Therefore, the
number of injective functions (and therefore arrangements with at most
one ball per box) is:

$$k \cdot (k-1) \cdots (k-n+1) = \frac{k!}{(k-n)!} = \binom{k}{n}\, n!$$

≥ 1 ball per box Of course, this is only possible if there are at least as many
balls as boxes (n ≥ k).

These arrangements correspond to surjective functions from [n] to [k].
They can also be thought of as partitions of [n] into k non-empty
distinguishable (!) parts. So we count the number of ways to partition [n] into
k non-empty indistinguishable parts (there are $s_k^{II}(n)$) and multiply this
by the number k! of ways to assign labels to the parts afterwards. So in
total, there are

$$k! \, s_k^{II}(n)$$

ways to assign n labeled balls to k non-empty labeled boxes.


arbitrary number of balls per box There are k choices for each of the n
balls, so $k^n$ arrangements in total.
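Since labeled balls in labeled boxes are just functions from [n] to [k], all three counts can be verified by listing every function for small n and k. The helper `count_functions` below is an illustrative name of ours.

```python
from itertools import product

def count_functions(n, k):
    """Count all, injective and surjective functions [n] -> [k] by enumeration."""
    total = injective = surjective = 0
    for f in product(range(k), repeat=n):    # f[i] is the box of ball i
        total += 1
        injective += len(set(f)) == n        # no box is used twice
        surjective += len(set(f)) == k       # every box is used
    return total, injective, surjective

more_boxes = count_functions(3, 4)   # n < k: injections exist, surjections do not
more_balls = count_functions(4, 3)   # n > k: surjections exist, injections do not
print(more_boxes, more_balls)
```

For n = 4 and k = 3 the surjective count is 3! times the number of partitions of [4] into three parts, which is 6.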


1.5.4 U → U: n Unlabeled Balls in k Unlabeled Boxes


Example 1.13. Adria used to play tennis but now wants to sell her old tennis
balls (n in total) on the local flea market. She wants to sell them in batches
since selling them individually is not worth the trouble. So she packs them into
k bags. Some people may want to buy bigger batches than others, so she figures
it might be good to have batches of varying sizes ready. She wonders how many
ways there are to pack the balls into bags. One such arrangement (n = 9, k = 4)
could be:

    [●●●●] [●●] [●●] [●]

Balls and boxes are unlabeled. Adria cannot distinguish any two balls and can
also not distinguish boxes with the same number of balls.

Even though boxes have no intrinsic ordering, we need to somehow arrange
them on this two-dimensional paper. In order to not accidentally think that
two drawings which differ only in the order of the boxes are different, we use
the convention of drawing boxes in decreasing order of balls. With this
convention, an arrangement will look different on paper if and only if it is
actually different in our sense.

With this in mind we see the number of arrangements of n unlabeled balls
in k unlabeled boxes is equal to the number of ways to partition the integer n
into k non-negative summands. For example:

    9 = 4 + 2 + 2 + 1.

Partitions where merely the order of the summands differs are considered the
same, so again we use the convention of writing summands in decreasing order.
≤ 1 ball per box Of course, this is only possible if there are at most as many
balls as boxes (n ≤ k).

In that case, there is only one way to do it: Put every ball in its own box.
Then there are n boxes with a ball and k − n empty boxes.

≥ 1 ball per box Of course, this is only possible if there are at least as many
balls as boxes (n ≥ k).

As discussed before, we count the number of ways in which the integer n
can be partitioned into exactly k positive parts, i.e.

$$n = a_1 + a_2 + \dots + a_k, \quad \text{where } a_1 \ge a_2 \ge \dots \ge a_k \ge 1.$$

This is counted by the partition function, denoted by $p_k(n)$. A few values
are easy to determine:

• $p_0(n) = 0$ (for n ≥ 1): No positive number can be partitioned into
  zero numbers.

• $p_n(n) = 1$: To write n as the sum of n positive numbers, there is
  exactly one choice: n = 1 + 1 + ... + 1 (n times).


To compute other values, observe the following recursion:

$$p_k(n) = p_k(n-k) + p_{k-1}(n-1).$$

To see this, observe the two cases:

• Either $a_k = 1$. Then $a_1 + \dots + a_{k-1}$ is a partition of n − 1 into k − 1
  parts and there are $p_{k-1}(n-1)$ ways for this.

• Or $a_k \ge 2$. This means each $a_i$ is at least 2 (for 1 ≤ i ≤ k). There is
  a bijection between:

  (i) Partitions of n into k parts of size at least 2.
  (ii) Partitions of n − k into k parts of size at least 1.

  To go from (i) to (ii), just remove 1 from each part and to go back,
  just add 1 to each part. We already know how to count (ii): there
  are $p_k(n-k)$ such partitions.
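The recursion translates directly into a short memoised function. In the sketch below the name `partitions_exact` is ours, and the base case $p_0(0) = 1$ (the empty sum) is an assumed convention that the text does not state explicitly; it is what makes $p_n(n) = 1$ come out of the recursion.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def partitions_exact(n, k):
    """p_k(n): number of partitions of the integer n into exactly k positive parts."""
    if n == 0 and k == 0:
        return 1          # assumed convention: the empty sum
    if k <= 0 or n < k:
        return 0          # too few parts, or not enough to give each part 1
    # smallest part >= 2 (subtract 1 from every part) or smallest part == 1 (drop it)
    return partitions_exact(n - k, k) + partitions_exact(n - 1, k - 1)

exactly_four = partitions_exact(9, 4)
at_most_four = sum(partitions_exact(9, i) for i in range(0, 5))
print(exactly_four, at_most_four)
```

Summing over the number of non-empty boxes, as in the second printed value, gives the count when empty boxes are allowed.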


arbitrary number of balls per box Similar to what we did in the L → U
case, first decide how many boxes i should be non-empty and then count
how many arrangements with exactly i non-empty boxes exist:

$$\sum_{i=0}^{k} p_i(n).$$

This can be thought of as the number of integer partitions of n into at
most k non-zero parts.

    n balls   k boxes   ≤ 1 per box           ≥ 1 per box            arbitrary
    U         L         $\binom{k}{n}$        $\binom{n-1}{k-1}$     $\binom{n+k-1}{k-1}$
    L         U         1                     $s_k^{II}(n)$          $\sum_{i=0}^{k} s_i^{II}(n)$
    L         L         $\binom{k}{n}\, n!$   $s_k^{II}(n)\, k!$     $k^n$
    U         U         1                     $p_k(n)$               $\sum_{i=0}^{k} p_i(n)$

Table 1: The twelvefold way.


1.5.5 Summary: The Twelvefold Way 


The twelve variations of counting arrangements of balls in boxes are called the 
Twelvefold Way. A summary of our results is given in Table 1. 

We also summarize the interpretations of n Labeled or Unlabeled balls in k
Labeled or Unlabeled boxes:

    Arrangements   Correspond to
    U → L          Non-negative integer solutions of $x_1 + \dots + x_k = n$.
    L → U          Partitions of the set [n] into k parts.
    L → L          Functions from [n] to [k].
    U → U          Partitions of the number n into k non-negative integers.


1.6 Binomial Coefficients — Examples and Identities 


Example 1.14 (Lattice Paths). A monotone lattice path in the grid D =
{0,1,...,m} × {0,1,...,n} is a path starting in the bottom left corner (0,0) and
ending in the top right corner (m,n), taking single steps upwards or rightwards,
see Figure 12.

    [Figure 12: a grid with bottom left corner (0,0) and top right corner (7,4)]

Figure 12: Here m = 7 and n = 4. The lattice path →↑→↑→→→↑→→↑ is
shown.

In other words, a lattice path is a sequence of “↑” (upward steps) and “→”
(rightward steps) with a total of n times ↑ and m times →. In yet other words,
the lattice paths correspond to permutations of the multiset {m·→, n·↑} and
by Theorem 1.8 their number is

$$\binom{m+n}{m}.$$
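Lattice paths can also be counted without any formula, by noting that the last step into a grid point comes either from the left or from below. The following dynamic-programming sketch (the function name is our own) uses this to confirm the binomial count.

```python
from math import comb

def count_lattice_paths(m, n):
    """Count monotone paths from (0, 0) to (m, n) by dynamic programming."""
    ways = [[1] * (n + 1) for _ in range(m + 1)]   # one path along each axis
    for x in range(1, m + 1):
        for y in range(1, n + 1):
            # the last step arrives either from the left or from below
            ways[x][y] = ways[x - 1][y] + ways[x][y - 1]
    return ways[m][n]

paths = count_lattice_paths(7, 4)
print(paths, comb(7 + 4, 7))
```

The table `ways` is Pascal's triangle drawn on the grid, which is why the two numbers agree.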
Example (Cake Number). For positive integers m and n the cake number
c(m, n) is the maximum number of pieces obtained when cutting an m-dimensional
cake by n cuts, i.e. the maximum number of connected components of
$\mathbb{R}^m \setminus \bigcup_{i=1}^{n} H_i$, where each $H_i$ is an (m−1)-dimensional affine hyperplane. For m = 2, this
simply means: Put n lines into the plane and observe into how many pieces the
plane is cut. See Figure 13 for an example.

Figure 13: The plane (m = 2) is cut by n = 4 lines. In this particular
configuration, this gives 11 pieces. No configuration with this n and m can achieve a
larger number of pieces.

It turns out that

$$c(m,n) = \sum_{i=0}^{m} \binom{n}{i}.$$

Here $\binom{n}{i}$ is defined to be zero for i > n. For instance

$$c(2,4) = \binom{4}{0} + \binom{4}{1} + \binom{4}{2} = 1 + 4 + 6 = 11,$$

so the example given in Figure 13 is tight.


The cake numbers arise in a seemingly unrelated setting: For m ≥ 3, consider
any finite set X of n points in $\mathbb{R}^{m-1}$ in general position (no m points on a
hyperplane, no m + 1 on an (m − 2)-dimensional sphere). Then the number of
subsets Y of X that can be separated from X \ Y by an (m − 2)-dimensional
sphere is equal to c(m,n).
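The cake number is the sum of the binomial coefficients $\binom{n}{i}$ for i = 0, ..., m, so it takes one line to compute. In this sketch (the function name `cake_number` is ours) we rely on Python's `math.comb` returning 0 when i > n, which matches the convention used for the formula.

```python
from math import comb

def cake_number(m, n):
    """c(m, n) = C(n, 0) + C(n, 1) + ... + C(n, m)."""
    return sum(comb(n, i) for i in range(m + 1))

lines_in_plane = cake_number(2, 4)     # four lines cutting the plane
circles_and_points = cake_number(3, 5) # five points in the plane, subsets cut off by circles
print(lines_in_plane, circles_and_points)
```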

For m = 3 this means: Consider n points in the plane where no 3 points
are on a line and no 4 points on a circle². Then the number of subsets Y of X
that can be captured by a circle is equal to c(3,n). Consider Figure 14 for an
example. We count all sets that can be captured there.


size 0: The empty set.

size 1: All 5 single element sets.

size 2: All except for {b,e} and {a,c}, so $\binom{5}{2} - 2 = 8$.

size 3: {a,b,c}, {a,b,d}, {a,b,e}, {a,d,e}, {b,c,d}, {b,d,e}, {c,d,e}, so 7 of them.

size 4: All except for {a,b,c,e}, so 4 of them.

size 5: The set {a,b,c,d,e}.

That is 1 + 5 + 8 + 7 + 4 + 1 = 26 sets in total, exactly the cake number

$$c(3,5) = \binom{5}{0} + \binom{5}{1} + \binom{5}{2} + \binom{5}{3} = 1 + 5 + 10 + 10 = 26.$$

²These assumptions are very weak: If you pick random points, say by throwing a bunch of
darts onto the plane, the probability that the points will be in general position is 1.

Figure 14: There are n = 5 points in the plane (m = 3). Some subsets can be
captured by a circle (the picture shows: {a,d}, {b,d,e}, {c,d,e}, {c}, ∅), some
cannot, for example {b,c,e} (such a circle would also contain d).


There is a seemingly infinite number of useful identities involving binomial
coefficients; we will show a few of them.


Theorem 1.15 (Proofs by Picture). We have:

(i) $\sum_{k=1}^{n} k = \frac{n^2 + n}{2} = \binom{n+1}{2}$,

(ii) $\sum_{k=1}^{n} (2k - 1) = n^2$.

Proof. (i) For the first identity, arrange 1 + 2 + ... + n dots in a triangle, mirror
it, and observe that this fills all positions of a square of size (n+1) × (n+1)
except for the diagonal. Here is a picture for n = 6:

    [Picture: a triangle of 1 + 2 + ... + 6 dots and its mirror image
    together fill a 7 × 7 square except for the diagonal.]

Therefore $2 \cdot \sum_{k=1}^{n} k = (n+1)^2 - (n+1) = n(n+1)$. Dividing by 2 proves
the claim.

(ii) Note how tiles with the sizes of the first n odd numbers can be arranged to
form a square of size n × n, here a picture for n = 5:

    [Picture: tiles of sizes 1, 3, 5, 7 and 9 arranged to form a 5 × 5 square.]

Theorem 1.16 (Proofs by Double Counting). We have

(i) $\sum_{k=0}^{n} 2^k \binom{n}{k} = 3^n$,

(ii) $\sum_{k=0}^{m} \binom{n+k}{k} = \binom{n+m+1}{m}$.


Proof. (i) We double count strings over the alphabet {0,1,2} of length n, in
other words, the n-permutations of the multiset {∞·0, ∞·1, ∞·2} where
there is infinite supply of the types 0, 1, 2.

First way: There are three possibilities per character, so $3^n$ possibilities
in total.

Second way: First choose the number of times k that the letter “0”
should be used (0 ≤ k ≤ n). Then choose the positions for those
characters, there are $\binom{n}{k}$ possibilities. Finally, choose for each of the
remaining (n − k) positions if they should be 1 or 2, there are $2^{n-k}$
choices. So in total, there is this number of possibilities:

$$\sum_{k=0}^{n} \binom{n}{k} 2^{n-k}.$$

By reversing the order of summation, this is equal to

$$\sum_{k=0}^{n} \binom{n}{n-k} 2^{k} = \sum_{k=0}^{n} \binom{n}{k} 2^{k}.$$

This proves the claim.

(ii) We double count the lattice paths in the lattice {0,...,m} × {0,...,n} of
width m and height n.

First way: We already counted them in Example 1.14. There are $\binom{m+n}{m}$
of them.

Second way: Any path must reach the last row (row n) eventually, using
exactly one of the “↑”-steps from (k,n−1) to (k,n) where 0 ≤ k ≤ m.
See Figure 15 for a sketch.
There is only one way a lattice path can continue from (k,n), namely
go rightwards until (m,n). There are $\binom{k+n-1}{k}$ ways to get from (0,0)
to (k,n−1) though, so the number of lattice paths is:

$$\sum_{k=0}^{m} \binom{k+n-1}{k}.$$

This gives the equality $\binom{m+n}{m} = \sum_{k=0}^{m} \binom{k+n-1}{k}$. In the claim, we merely
replaced n by n + 1.
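Double-counting identities are easy to test numerically before (or after) proving them. The sketch below checks the two sums $\sum_{k=0}^{n} 2^k \binom{n}{k} = 3^n$ and $\sum_{k=0}^{m} \binom{n+k}{k} = \binom{n+m+1}{m}$ for small parameters; the helper names are our own, and the check is of course no substitute for the proofs.

```python
from math import comb

def powers_of_two_sum(n):
    """Left-hand side of the first identity: sum of 2^k * C(n, k)."""
    return sum(2 ** k * comb(n, k) for k in range(n + 1))

def diagonal_sum(n, m):
    """Left-hand side of the second identity: sum of C(n + k, k) for k = 0..m."""
    return sum(comb(n + k, k) for k in range(m + 1))

checked = 0
for n in range(8):
    assert powers_of_two_sum(n) == 3 ** n
    for m in range(8):
        assert diagonal_sum(n, m) == comb(n + m + 1, m)
        checked += 1
print("identities verified for", checked, "pairs (n, m)")
```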


A few other identities can be derived by using the connection of multinomial 
coefficients to polynomials. 
Figure 15: At some point, a lattice path must cross the dashed line, i.e. use
one of the edges going to the last row. We count the number of paths using the
highlighted edge from (4,3) to (4,4): There are $\binom{7}{4}$ ways to get from (0,0) to
(4,3) and one way to get from (4,4) to (7,4), so $\binom{7}{4} \cdot 1$ paths in total.


Theorem 1.17 (Proofs by Analysis). We show three equations (i), (ii) and
(iii), see below.

Proof. Start with the binomial formula:

$$(x+y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}.$$

Setting y = 1 yields:

$$\text{(i)} \qquad (x+1)^n = \sum_{i=0}^{n} \binom{n}{i} x^i.$$

Differentiating with respect to x gives:

$$n(x+1)^{n-1} = \sum_{i=1}^{n} i \binom{n}{i} x^{i-1}.$$

Now set x = 1 and get:

$$\text{(ii)} \qquad n \cdot 2^{n-1} = \sum_{i=0}^{n} i \binom{n}{i}.$$

Taking the second derivative of (i) yields:

$$n(n-1)(x+1)^{n-2} = \sum_{i=2}^{n} i(i-1) \binom{n}{i} x^{i-2}.$$

Again, we set x = 1:

$$n(n-1)2^{n-2} = \sum_{i=0}^{n} i(i-1) \binom{n}{i}.$$

Adding (ii) to this gives:

$$\text{(iii)} \qquad n(n+1)2^{n-2} = \sum_{i=0}^{n} i^2 \binom{n}{i}.$$
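The final addition step can be written out in full. The computation below uses only the second-derivative identity, identity (ii), and the observation $i^2 = i(i-1) + i$; the numerical check for n = 3 is our own addition.

```latex
\begin{align*}
\sum_{i=0}^{n} i^2 \binom{n}{i}
  &= \sum_{i=0}^{n} i(i-1)\binom{n}{i} + \sum_{i=0}^{n} i\binom{n}{i}
     && \text{since } i^2 = i(i-1) + i \\
  &= n(n-1)\,2^{n-2} + n\,2^{n-1} \\
  &= n\,2^{n-2}\bigl((n-1) + 2\bigr) = n(n+1)\,2^{n-2}.
\end{align*}
% Check for n = 3: 0*1 + 1*3 + 4*3 + 9*1 = 24 = 3 * 4 * 2.
```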


1.7 Permutations of Sets 


Remember that n-permutations of [n] are bijections π : [n] → [n]. From now on
we will just say “permutation of [n]” and drop the “n-”.
For such a permutation, the pair (i,j) of two numbers from [n] is called an
inversion if the numbers are ordered, but swapped by the permutation, i.e.

$$(i,j) \text{ is an inversion} \iff i < j \ \wedge\ \pi^{-1}(i) > \pi^{-1}(j).$$

Take for example the permutation π = 31524. It has the inversions (1,3), (2,3),
(2,5), (4,5).
For i ∈ [n] define $a_i$ := |{j ∈ [n] | (i,j) is an inversion}|. The inversion sequence
of π is the sequence $a_1, a_2, \dots, a_n$.
The inversion number or disorder of a permutation π is the number of its
inversions, so $a_1 + \dots + a_n$. For our example the inversion sequence is 1, 2, 0, 1, 0
and its disorder is 4.

Since for any i ∈ [n], there are only n − i numbers bigger than i, any inversion
sequence $a_1,\dots,a_n$ satisfies:

$$0 \le a_i \le n - i \quad (i \in [n]). \qquad (\star)$$

There are exactly n! integer sequences satisfying (⋆) (n choices for $a_1$, n − 1
choices for $a_2$, ...), just as many as there are permutations. This is no
coincidence: We now show that each permutation has a distinct such sequence as
an inversion sequence. By cardinality we then also know that each such
sequence is the inversion sequence of some permutation, i.e. there is a one-to-one
correspondence.


Theorem 1.18. Any permutation can be recovered from its inversion sequence.

Proof. Let $a = (a_1,\dots,a_n)$ be integers satisfying (⋆). We will construct a
permutation π with this inversion sequence and it will be clear that we will not
have any freedom in the construction of π, i.e. π will be the unique permutation
with inversion sequence a.

In the beginning, π has n free positions still to be determined:

    π = _ _ _ ... _     (n positions)

We determine the position containing 1, i.e. we determine $\pi^{-1}(1)$: Remember
that $a_1$ is the number of elements that are bigger than 1 but arranged to the
left of 1 by π. Since every element is bigger than 1, $a_1$ is just the number of
elements arranged to the left of 1, in other words $\pi^{-1}(1) = a_1 + 1$.

Take for instance the inversion sequence a = (5,3,4,0,2,1,0,0). Then from
$a_1 = 5$ we know that 1 is in the sixth position like this:

    π = _ _ _ _ _ 1 _ _

Now recall that $a_2$ is the number of elements bigger than 2 that are sorted to
the left of 2; this means $a_2$ is exactly the number of (currently) unoccupied
positions to the left of 2. In the example we have $a_2 = 3$, so π must look like
this:

    π = _ _ _ 2 _ 1 _ _

For general i ∈ [n], if we have already placed all numbers from 1 to i − 1, we
can derive that $\pi^{-1}(i)$ must be the position of the free slot that has exactly $a_i$
free slots to the left of it.
In the example, since $a_3 = 4$ we put 3 into the unoccupied position that has
four unoccupied positions to the left of it:

    π = _ _ _ 2 _ 1 3 _

Continuing this way, we can reconstruct π completely:

    π = 47625138.
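Both directions of this correspondence are short to program. The sketch below is illustrative (function names are ours; a permutation is given as the list of its entries read left to right): one function computes the inversion sequence, the other rebuilds the permutation by the slot-filling procedure of the proof.

```python
def inversion_sequence(perm):
    """a_i = number of elements larger than i that stand to the left of i."""
    position = {value: index for index, value in enumerate(perm)}
    n = len(perm)
    return [sum(1 for j in range(i + 1, n + 1) if position[j] < position[i])
            for i in range(1, n + 1)]

def from_inversion_sequence(a):
    """Place 1, 2, ..., n in turn into the free slot with a_i free slots to its left."""
    perm = [None] * len(a)
    free = list(range(len(a)))           # indices of still unoccupied positions
    for i, a_i in enumerate(a, start=1):
        perm[free.pop(a_i)] = i
    return perm

sequence = inversion_sequence([3, 1, 5, 2, 4])
rebuilt = from_inversion_sequence([5, 3, 4, 0, 2, 1, 0, 0])
print(sequence, rebuilt)
```

Because `free.pop(a_i)` has exactly one possible outcome at each step, the code also reflects the uniqueness claim of the theorem.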


1.7.1 Generating Permutations 


To generate all permutations of [n] use the following algorithm.

Start: Write 1, 2, ..., n with left arrows over them.

Step: - Locate the largest mobile integer m, i.e., the largest integer whose
        arrow points to a smaller neighbouring number. If there is no
        mobile integer, stop.
      - Swap m with that neighbour (each number keeps its arrow).
      - Change the direction of the arrows on all integers p with p > m.


Let us illustrate this method with the following example.

    [Figure: the list of permutations produced by the algorithm, one per
    row, each number shown with its current arrow.]
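The arrow procedure described above is usually known as the Steinhaus-Johnson-Trotter algorithm. The following generator is a minimal sketch of it under our own naming; arrows are stored as −1 (left) and +1 (right) and travel with their numbers when two entries are swapped.

```python
def johnson_trotter(n):
    """Yield all permutations of 1..n, each obtained from the last by one adjacent swap."""
    perm = list(range(1, n + 1))
    direction = [-1] * n             # -1: arrow points left, +1: arrow points right
    yield tuple(perm)
    while True:
        # find the largest mobile integer: its arrow points to a smaller neighbour
        mobile = -1
        for i in range(n):
            j = i + direction[i]
            if 0 <= j < n and perm[j] < perm[i] and (mobile < 0 or perm[i] > perm[mobile]):
                mobile = i
        if mobile < 0:
            return                   # no mobile integer left: every permutation was listed
        j = mobile + direction[mobile]
        m = perm[mobile]
        # swap m with the neighbour it points to; the arrows move with the numbers
        perm[mobile], perm[j] = perm[j], perm[mobile]
        direction[mobile], direction[j] = direction[j], direction[mobile]
        # reverse the arrows of all integers larger than m
        for i in range(n):
            if perm[i] > m:
                direction[i] = -direction[i]
        yield tuple(perm)

order_for_three = ["".join(map(str, p)) for p in johnson_trotter(3)]
print(order_for_three)
```

Consecutive permutations in the output differ by a single swap of neighbours, which is the characteristic property of this method.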
1.7.2 Cycle Decompositions


Remark. The lecture mentioned a puzzle presented on the YouTube channel
MinutePhysics. It may have something to do with cycles (but we do not want
to spoil the puzzle for you); feel free to check it out at


http://youtube.com/watch?v=eivGIBKIK6M


We already know several ways to specify a permutation.

Write it as a string: π = 78546132.

Write it explicitly as a map:

    i    | 1 2 3 4 5 6 7 8
    π(i) | 7 8 5 4 6 1 3 2

Another way to specify the map would be to connect i with π(i), maybe
like this:

    [Figure: the numbers 1 to 8 in a top row, each joined by an edge to
    its image π(i) in a bottom row of the numbers 1 to 8.]

In the last drawing every number occurs twice (once as a preimage and
once as an image); now we only take one copy of every number and give
a direction to the edges (from preimage to image):

    [Figure: the directed cycles 1 → 7 → 3 → 5 → 6 → 1, 2 → 8 → 2
    and 4 → 4.]

In other words, we draw an edge i → j if π(i) = j. Since π is a map,
every node gets one outgoing edge and since π is bijective, every node gets
exactly one incoming edge. From this it is easy to see that the image we
get is a collection of cycles: When starting at any node i and following the
arrows, i.e. walking along the path i → π(i) → π(π(i)) → π(π(π(i))) → ...,
we must at some point (since there are only finitely many elements)
come for the first time to an element where we already were. This must
be i, since all other nodes already have an incoming edge. So i was indeed
in a cycle.
Hees oe 
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Instead of drawing the picture we prefer to just list the cycles like this: 


m = (17356) (28)(4) 


” 


where cycles are group by “()” and elements within a cycle are given in 
order. The representation at hand is called a disjoint cycle decomposition. 
Note that every element occurs in exactly one cycle (some cycles may have 
length 1). Note that this composition is, strictly speaking, not unique, we 
could also write 7 = (4)(73561)(28), where we changed the order of two 
cycles and “rotated” 7 to the beginning of its cycle. In the following, we 
will not distinguish disjoint cycle decompositions that differ only in these 
ways. 
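As an illustration, a disjoint cycle decomposition can be read off the string representation by following the arrows from each not yet visited element. The following is a minimal sketch; the helper name `disjoint_cycles` is our own, and it assumes $n \leq 9$ so that every image is a single digit of the string.

```python
def disjoint_cycles(word):
    """Disjoint cycles of the permutation written as a string of digits.

    Assumption: n <= 9, so the i-th character of `word` is pi(i).
    """
    image = {i + 1: int(ch) for i, ch in enumerate(word)}
    seen = set()
    cycles = []
    for start in sorted(image):
        if start in seen:
            continue
        cycle = []
        current = start
        # Follow start -> pi(start) -> pi(pi(start)) -> ... until we return.
        while current not in seen:
            seen.add(current)
            cycle.append(current)
            current = image[current]
        cycles.append(tuple(cycle))
    return cycles


print(disjoint_cycles("78546132"))
```

For the running example this lists the cycles (17356), (28) and (4), each starting with its smallest element.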


Before we generalize disjoint cycle decompositions to arbitrary cycle
decompositions, we define the product of two permutations:

If $\pi_1, \pi_2 \in S_n$ are permutations, the product $\pi_1 \cdot \pi_2$ (or just $\pi_1\pi_2$ for short)
is the permutation $\pi$ with $\pi(i) = \pi_2(\pi_1(i))$ (i.e. first apply $\pi_1$, then $\pi_2$). Note
that, when viewed as maps, $\pi_1 \cdot \pi_2 = \pi_2 \circ \pi_1$.

Now if we write a single cycle, e.g. $\pi \in S_7$, $\pi = (137)$, we mean the
permutation that permutes the elements in the cycle along the cycle (here $\pi(1) = 3$,
$\pi(3) = 7$, $\pi(7) = 1$) and leaves all other elements in place. In our example $\pi$ is
the permutation with the disjoint cycle decomposition $(137)(2)(4)(5)(6)$.

We can now talk about cycle decompositions where elements may occur more
than once, e.g. $\pi = (136)(435)(7452)(6)(23)$. It is the product of the involved
cycles ($(136)$, $(435)$, ...).

If $\pi \in S_n$ is a permutation given as a cycle decomposition, then $\pi$ can be
evaluated by parsing, where parsing an element $i \in [n]$ in a cycle decomposition
means:

• Have a current element $c$, initially $c = i$.

• Go through cycles from left to right.

  - If the current element $c$ is part of a cycle, replace $c$ with the next
    element in this cycle.

• In the end of this process $\pi(i)$ is the current element $c$.


Take for instance $\pi = (136)(435)(7452)(6)(23)$ and $i = 3$. Then we start with
the current element $c = 3$. The first cycle $(136)$ contains $c = 3$, so we change
$c$ to 6 and go on. The next cycle $(435)$ does not contain $c = 6$, and neither
does $(7452)$, so we move past these cycles without changing $c$. The next cycle
$(6)$ contains $c = 6$ but does not alter it, and the last cycle $(23)$ does not
contain $c = 6$. We therefore end with $\pi(3) = 6$. As another example, consider
how $\pi(1)$ is evaluated; the current element $c$ changes as follows:

$$1 \xrightarrow{(136)} 3 \xrightarrow{(435)} 5 \xrightarrow{(7452)} 2 \xrightarrow{(6)} 2 \xrightarrow{(23)} 3$$

So $\pi(1) = 3$.
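The parsing procedure translates almost literally into code. The sketch below represents a cycle decomposition as a list of tuples; the function name `parse` and this representation are our own choices for illustration.

```python
def parse(cycles, i):
    """Evaluate the permutation given by `cycles` at i by parsing."""
    c = i                                    # current element
    for cycle in cycles:                     # go through cycles left to right
        if c in cycle:
            # replace c with the next element in this cycle (cyclically)
            c = cycle[(cycle.index(c) + 1) % len(cycle)]
    return c


pi = [(1, 3, 6), (4, 3, 5), (7, 4, 5, 2), (6,), (2, 3)]
print(parse(pi, 3), parse(pi, 1))
```

The two printed values are the images of 3 and 1 from the worked examples above.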


Theorem 1.19. If $\pi_1, \pi_2 \in S_n$ are permutations given in cycle decomposition,
then a cycle decomposition of $\pi_1 \cdot \pi_2$ is given by concatenating the cycle
decompositions of $\pi_1$ and $\pi_2$.

Proof. Let the cycle decompositions be $\pi_1 = C_1 C_2 \dots C_k$, $\pi_2 = C'_1 C'_2 \dots C'_l$.
When parsing $C_1 C_2 \dots C_k C'_1 C'_2 \dots C'_l$ with $i \in [n]$, the current element
after $C_1 C_2 \dots C_k$ will be $\pi_1(i)$ and therefore, after going through the remaining
cycles $C'_1 \dots C'_l$, the current element will be $\pi_2(\pi_1(i)) = (\pi_1 \cdot \pi_2)(i)$. So the
concatenation is indeed a cycle decomposition of $\pi_1 \cdot \pi_2$.


Note that cycles that are not disjoint do not commute in general, e.g.

$$(435)(321) = (13542) \neq (15432) = (321)(435),$$

where the identities can be verified by parsing. The claims of the following
theorem are easy and we will skip the proofs:


Theorem 1.20. Let $\pi \in S_n$ be given in cycle decomposition.


• Swapping adjacent cycles has no effect if they are disjoint (i.e. no number
  occurs in both cycles), e.g. $(13)(524) = (524)(13)$.


• A cycle decomposition of $\pi^{-1}$ is obtained by swapping the order of all
  cycles and reversing the elements in each cycle, e.g. $((321)(435))^{-1} =
  (534)(123)$.


• Cyclic shifts in a cycle have no effect, e.g. $(534) = (345) = (453)$.


• Up to cyclic shifts and order of cycles, there is a unique decomposition
  into disjoint cycles.


Let $\mathrm{id} \in S_n$ be the identity permutation ($\mathrm{id}(i) = i$ for $i \in [n]$). Define the order
of $\pi \in S_n$ to be the smallest $k \geq 1$ such that

$$\pi^k = \underbrace{\pi \cdot \pi \cdot \ldots \cdot \pi}_{k \text{ times}} = \mathrm{id}.$$


Theorem 1.21. The order of $\pi$ is the least common multiple of the lengths of
the cycles in the disjoint cycle decomposition of $\pi$.


Proof. Assume $\pi$ is composed of the disjoint cycles $C_1 C_2 \dots C_m$ and an element
$i \in [n]$ is contained in a cycle $C_j$ of length $l$. Then for any $k \in \mathbb{N}$, the value
$\pi^k(i)$ is obtained by parsing $i$ through

$$\underbrace{C_1 C_2 \dots C_m \; C_1 C_2 \dots C_m \; \dots \; C_1 C_2 \dots C_m}_{k \text{ copies of } C_1 \dots C_m}.$$

Only the copies of $C_j$ can affect the current value if we start with $i$, so the result
is the same as when parsing $i$ through $k$ copies of just $C_j$. From this we see
that $\pi^k(i) = i$ if and only if $k$ is a multiple of $l$. This shows that $\pi^k = \mathrm{id}$ if and
only if $k$ is a common multiple of the cycle lengths. So, the order of $\pi$ is the
least common multiple of them.
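Theorem 1.21 can be checked numerically on the running example $\pi = 78546132$ with cycle lengths 5, 2 and 1. The sketch below compares the least common multiple with the order found by repeated multiplication; both function names are our own.

```python
from functools import reduce
from math import gcd


def order_by_lcm(cycle_lengths):
    """Least common multiple of the cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths, 1)


def order_by_powers(images):
    """Smallest k >= 1 with pi^k = id, where images[i-1] = pi(i)."""
    identity = tuple(range(1, len(images) + 1))
    current = tuple(images)      # current[i-1] = pi^k(i), starting with k = 1
    k = 1
    while current != identity:
        # pi^(k+1)(i) = pi(pi^k(i))
        current = tuple(images[c - 1] for c in current)
        k += 1
    return k


print(order_by_lcm([5, 2, 1]), order_by_powers((7, 8, 5, 4, 6, 1, 3, 2)))
```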


1.7.3 Transpositions


A cycle of length 2 is a transposition.

Define the discriminant of $\pi$ as $N(\pi) := n - \#C$, where $\#C$ is the number of
cycles in a disjoint cycle decomposition of $\pi$. Note that we may not omit single
element cycles now; they count towards $\#C$.

In $S_5$ we have for example $N(\mathrm{id}) = N((1)(2)(3)(4)(5)) = 5 - 5 = 0$ and
$N((134)(25)) = 5 - 2 = 3$.

Theorem 1.22.
(i) Any permutation can be written as the product of transpositions.
(ii) If $\pi$ is the product of $k$ transpositions ($k \in \mathbb{N}$), then $N(\pi)$ and $k$ have the
same parity (i.e. $N(\pi) \equiv k \pmod 2$).


Proof. (i) Let $\pi$ be any permutation. We already know that we can write $\pi$
as a product of cycles. So to show that $\pi$ can be written as a product
of transpositions, it suffices to show that any cycle can be written as the
product of transpositions.

Let $C = (a_1 a_2 \dots a_l)$ be a cycle of length $l$ (assume $l \geq 2$, since cycles of
length 1 correspond to identity permutations and can be omitted from any
cycle decomposition).

Then we can write $C$ as the product of $l - 1$ transpositions:

$$C = (a_1 a_2 \dots a_l) = (a_{l-1} a_l)(a_{l-2} a_{l-1}) \cdots (a_2 a_3)(a_1 a_2).$$

Verify this by parsing: If $i \neq l$, then parsing $a_i$ yields:

$$a_i \xrightarrow{(a_{l-1}a_l)} a_i \xrightarrow{(a_{l-2}a_{l-1})} \cdots \xrightarrow{(a_{i+1}a_{i+2})} a_i \xrightarrow{(a_i a_{i+1})} a_{i+1} \xrightarrow{(a_{i-1}a_i)} a_{i+1} \longrightarrow \cdots \xrightarrow{(a_1 a_2)} a_{i+1}$$

So parsing sends $a_i$ to $a_{i+1}$ as desired. When parsing $a_l$ we have:

$$a_l \xrightarrow{(a_{l-1}a_l)} a_{l-1} \xrightarrow{(a_{l-2}a_{l-1})} a_{l-2} \longrightarrow \cdots \xrightarrow{(a_2 a_3)} a_2 \xrightarrow{(a_1 a_2)} a_1$$

so we get $a_1$ as desired.
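The decomposition of a cycle into $l - 1$ transpositions can be checked by parsing, as in the proof. The following sketch uses our own helper names `parse` and `transpositions` and a list-of-tuples representation of cycle decompositions.

```python
def parse(cycles, i):
    """Evaluate the permutation given by `cycles` at i by parsing."""
    c = i
    for cycle in cycles:
        if c in cycle:
            c = cycle[(cycle.index(c) + 1) % len(cycle)]
    return c


def transpositions(cycle):
    """(a_1 ... a_l) as the product (a_{l-1} a_l) ... (a_2 a_3)(a_1 a_2)."""
    return [(cycle[j], cycle[j + 1]) for j in range(len(cycle) - 2, -1, -1)]


cycle = (1, 7, 3, 5, 6)
product = transpositions(cycle)
print(product)
# Parsing the product of transpositions agrees with parsing the cycle itself.
print(all(parse(product, x) == parse([cycle], x) for x in range(1, 9)))
```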


(ii) Let $\pi$ be any permutation with $k$ cycles in a disjoint cycle decomposition.
Let $(ab)$ be any transposition. We determine the number of cycles in a
disjoint cycle decomposition of $(ab) \cdot \pi$ depending on $a$ and $b$.


Case 1: $a$ and $b$ are part of different cycles in $\pi$. Then there is a
disjoint cycle decomposition of $\pi$ looking like this:

$$\pi = (aX)(bY)C_3 \dots C_k$$

where $X$ and $Y$ are sequences of elements and $C_3, \dots, C_k$ are cycles.
By parsing we verify:

$$(ab) \cdot \pi = (ab)(aX)(bY)C_3 \dots C_k = (aYbX)C_3 \dots C_k.$$

The last term is a disjoint cycle decomposition of $(ab) \cdot \pi$ into $k - 1$
cycles.


Case 2: $a$ and $b$ are part of the same cycle in $\pi$. Then there is a
disjoint cycle decomposition of $\pi$ looking like this:

$$\pi = (aXbY)C_2 \dots C_k$$

where $X$ and $Y$ are sequences of elements and $C_2, \dots, C_k$ are cycles.
By parsing we verify:

$$(ab) \cdot \pi = (ab)(aXbY)C_2 \dots C_k = (aY)(bX)C_2 \dots C_k.$$

The last term is a disjoint cycle decomposition of $(ab) \cdot \pi$ into $k + 1$
cycles.

In both cases, multiplying by a transposition changed the number of cycles by
exactly one, which means that the parity of the number of cycles, and hence
of $N$, changed. Since $N(\mathrm{id}) = 0$, the claim follows by induction on the
number of transpositions.


We call the number $s^{I}_k(n)$ of permutations of $[n]$ with exactly $k$ cycles in
a disjoint cycle decomposition the unsigned Stirling number of first kind. More
precisely $s^{I}_k(n) = |\{\pi \in S_n \mid N(\pi) = n - k\}|$.


Theorem 1.23. For all $n, k \geq 1$


(i) $s^{I}_0(0) = 1$ and $s^{I}_k(0) = s^{I}_0(n) = 0$,


(ii) $s^{I}_k(n) = (n-1)\, s^{I}_k(n-1) + s^{I}_{k-1}(n-1)$.


Proof. (i) The empty permutation is the only permutation of 0 elements.
Thus $s^{I}_0(0) = 1$ and $s^{I}_k(0) = 0$, since the empty permutation has no cycle
and $k \geq 1$.

On the other hand every permutation of $n \geq 1$ elements has at least one
cycle and hence $s^{I}_0(n) = 0$.


(ii) Let $S_n^k \subset S_n$ denote the set of permutations in $S_n$ with exactly $k$ cycles
in a disjoint cycle decomposition. Then $|S_n^k| = s^{I}_k(n)$. We define a map
$\phi: S_n^k \to S_{n-1}^{k} \cup S_{n-1}^{k-1}$ as follows. Consider a permutation $\pi \in S_n^k$ and
let $(AnB)$ denote the cycle containing $n$ ($A$ and $B$ may be empty). If
$(AnB) = (n)$, remove the entire cycle $(n)$. Otherwise replace the cycle
$(AnB)$ with $(AB)$. In the first case there are $k - 1$ cycles in a disjoint
cycle decomposition of $\phi(\pi)$. In the latter case $\phi(\pi)$ still has $k$ cycles in a
disjoint cycle decomposition.


Consider a disjoint cycle decomposition of some $\sigma \in S_{n-1}^{k} \cup S_{n-1}^{k-1}$. We
count how many permutations $\pi \in S_n^k$ are mapped to $\sigma$.


Case 1: $\sigma \in S_{n-1}^{k-1}$. Then the only permutation in $S_n^k$ mapped to $\sigma$ is
obtained by adding the cycle $(n)$ to $\sigma$.


Case 2: $\sigma \in S_{n-1}^{k}$. In this case we may place $n$ between any element
$x \in [n-1]$ and its successor in $\sigma$ (i.e., put it right behind $x$ in the
cycle containing $x$). Then $n$ has different positions for distinct choices
of $x \in [n-1]$. Hence $n - 1$ distinct elements from $S_n^k$ are mapped
to $\sigma$.


Combining these two cases we see that $|S_n^k| = (n-1)|S_{n-1}^{k}| + |S_{n-1}^{k-1}|$.
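The recursion of Theorem 1.23 is easy to evaluate and to compare with a direct count of permutations by their number of cycles. The following sketch uses our own function names; the brute-force check is only feasible for small $n$.

```python
from functools import lru_cache
from itertools import permutations


@lru_cache(maxsize=None)
def stirling_first(n, k):
    """Unsigned Stirling number of first kind via Theorem 1.23."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return (n - 1) * stirling_first(n - 1, k) + stirling_first(n - 1, k - 1)


def count_by_cycles(n, k):
    """Count permutations of n elements with exactly k cycles (brute force)."""
    count = 0
    for image in permutations(range(n)):
        seen = [False] * n
        cycles = 0
        for start in range(n):
            if not seen[start]:
                cycles += 1
                c = start
                while not seen[c]:
                    seen[c] = True
                    c = image[c]
        count += (cycles == k)
    return count


print([stirling_first(6, k) for k in range(7)])
```

The printed row for $n = 6$ contains the value 225 for $k = 3$, which is $50 + 5 \cdot 35$.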


Recall that the Stirling numbers of second kind $s^{II}_k(n)$ count the number of
partitions of $[n]$ into $k$ non-empty parts. For these numbers we had the formula
$s^{II}_k(n) = k\, s^{II}_k(n-1) + s^{II}_{k-1}(n-1)$. There is also a similar picture of the
recursion, see Figure 16. From the picture we easily see that $s^{I}_1(n) = (n-1)!$
(for $n \geq 1$). This is the number of different cycles of length $n$ formed by $n$
elements.

[Figure: triangle of the numbers $s^{I}_k(n)$; the row for $n = 6$ reads 0, 120, 274, 225, 85, 15, 1.]

Figure 16: A number $s^{I}_k(n)$ is obtained as the sum of the number $s^{I}_{k-1}(n-1)$
toward the top left and $(n-1)$ times the number $s^{I}_k(n-1)$ towards the top
right. E.g. for $n = 6$, $k = 3$: $225 = 50 + 5 \cdot 35$.


1.7.4 Derangements 


A permutation $\pi \in S_n$ is a derangement if it is fixpoint-free, i.e., $\forall i \in [n]:
\pi(i) \neq i$. Note that this means that each cycle has length at least 2. The set of
all derangements of $[n]$ is denoted by $D_n$.

We close this chapter by “counting” derangements in Theorem 1.24. This 
is also meant to demonstrate that there are several different ways to count. 
Depending on the purpose, different counts may be more or less helpful. 


Remark. You might find the notation $|D_n| = {!n}$ in the literature (sic!). Since we
think this notation is too confusing we won’t use it here.


Before we start observe that $|D_1| = 0$ and $|D_2| = 1$. We may also agree
on $|D_0| = 1$, since the empty permutation has no fixpoint, but we try to avoid
using $D_0$.

Theorem 1.24. For a natural number $n \geq 1$ we have

$$|D_n| = (n-1)\bigl(|D_{n-1}| + |D_{n-2}|\bigr) \quad \text{if } n \geq 2, \qquad \text{(Recursion (i))}$$

$$|D_n| = n|D_{n-1}| + (-1)^n, \qquad \text{(Recursion (ii))}$$

$$n! = \sum_{k=0}^{n} \binom{n}{k} |D_k|, \quad |D_n| = n! - \sum_{k=0}^{n-1} \binom{n}{k} |D_k|, \qquad \text{(Recursion (iii))}$$

$$|D_n| = \sum_{k=0}^{n} \binom{n}{k} (-1)^{n+k}\, k! = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}, \qquad \text{(Summation)}$$

$$|D_n| = \Bigl\lfloor \frac{n!}{e} + \frac{1}{2} \Bigr\rfloor, \qquad \text{(Explicit)}$$

$$|D_n| \sim \sqrt{2\pi n}\, \frac{n^n}{e^{n+1}}, \qquad \text{(Asymptotic)}$$

$$\sum_{n \geq 0} \frac{|D_n|}{n!} x^n = \frac{e^{-x}}{1-x} \quad \forall x \in \mathbb{R} \setminus \{1\}. \qquad \text{(Generating Function)}$$
Proof.


Recursion (i): We prove this statement using a map $\phi: D_n \to D_{n-1} \cup D_{n-2}$
as follows. Consider a derangement $\pi \in D_n$ and let $x, y \in [n]$ be the
unique numbers such that $\pi(x) = n$ and $\pi(n) = y$. Written as a string, $\pi$
is of the form $\pi = AnBy$ where $A$ is a sequence of $x - 1$ elements and $B$
is a sequence of $n - x - 1$ elements.


If (case 1) $x \neq y$, then define $\phi(\pi) := AyB$. Note that $\sigma = \phi(\pi) \in D_{n-1}$,
since $\sigma(x) = y \neq x$. For example, if $\pi = 45123$, then $x = 2$, $y = 3$ and
$\phi(\pi) = 4312$.

If (case 2) $x = y$, then define $\phi(\pi) := A^*B^*$ where $A^*$ and $B^*$ are
identical to $A$ and $B$ except that all numbers bigger than $x$ are decreased
by 1. We claim that $\sigma = \phi(\pi) \in D_{n-2}$. If $i < x$, then $\sigma(i) = \pi(i) \neq i$ if
$\pi(i) < x$, and $\sigma(i) = \pi(i) - 1 \geq x > i$ if $\pi(i) > x$. Otherwise $i \geq x$ and
$\sigma(i) = \pi(i+1) < x \leq i$ if $\pi(i+1) < x$, and $\sigma(i) = \pi(i+1) - 1 \neq i + 1 - 1 = i$
if $\pi(i+1) > x$. For example, if $\pi = 45132$, then $x = y = 2$, $AB = 413$ and
$\phi(\pi) = 312$.


Claim. Each $\sigma \in D_{n-1} \cup D_{n-2}$ is the image of exactly $(n-1)$ permutations
$\pi \in D_n$.


Case 1: $\sigma \in D_{n-1}$. Then there are $n - 1$ choices to pick a position
$x \in [n-1]$. Let $\sigma(x) = y$, write $\sigma = AyB$ and define $\pi = AnBy$.
Then $\phi(\pi) = \sigma$. Note that $\pi \in D_n$, since $\pi(x) = n \neq x$ and
$\pi(n) = y \neq n$.

Case 2: $\sigma \in D_{n-2}$. Then there are $n - 1$ choices to put $n$ on position
$x$ (as the first element, into a gap or after the last element). Then
there is a unique way to increase $\sigma(i)$ by 1 for each $i$ with $\sigma(i) \geq x$.
Finally put $x$ on position $n$. We obtain $\pi = AnBx$, which is mapped
to $\sigma$. Note that $\pi \in D_n$ due to similar arguments as above.

Since $D_{n-1}$ and $D_{n-2}$ are disjoint, the claim yields
$|D_n| = (n-1)(|D_{n-1}| + |D_{n-2}|)$.


Recursion (ii): We will apply induction on $n$. An induction basis is given by
$|D_2| = 1$ and $|D_1| = 0$. Suppose $n \geq 3$. We rewrite Recursion (i) as
$|D_n| - n|D_{n-1}| = -(|D_{n-1}| - (n-1)|D_{n-2}|)$. By induction hypothesis,
the right side is equal to $-(-1)^{n-1} = (-1)^n$, which proves the claim.

Remark. For the number $|S_n| = P(n) = n!$ of permutations of $[n]$ we have
a similar recursion $|S_n| = (n-1)(|S_{n-1}| + |S_{n-2}|)$, since
$n! = (n-1)((n-1)! + (n-2)!)$, and $|S_n| = n|S_{n-1}|$.
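Both recursions are easy to evaluate and to compare with a direct enumeration of fixpoint-free permutations. The following sketch uses the convention $|D_0| = 1$ mentioned before the theorem; all function names are our own.

```python
from itertools import permutations


def derangements_brute(n):
    """Count fixpoint-free permutations of n elements by enumeration."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))


def derangements_rec1(n):
    """Recursion (i): |D_n| = (n-1)(|D_{n-1}| + |D_{n-2}|)."""
    d = [1, 0]                       # |D_0| = 1, |D_1| = 0
    for m in range(2, n + 1):
        d.append((m - 1) * (d[m - 1] + d[m - 2]))
    return d[n]


def derangements_rec2(n):
    """Recursion (ii): |D_n| = n|D_{n-1}| + (-1)^n, starting from |D_0| = 1."""
    d = 1
    for m in range(1, n + 1):
        d = m * d + (-1) ** m
    return d


print([derangements_rec1(n) for n in range(8)])
```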


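Recursion (iii): We count all permutations in $S_n$. On the one hand $|S_n| = n!$.
On the other hand each $\pi \in S_n$ induces a derangement on $[n] \setminus F(\pi)$, where
$F(\pi)$ is the set of all fixpoints of $\pi$. Thus there are $\binom{n}{k}|D_{n-k}|$ different
permutations in $S_n$ with exactly $k$ fixpoints each. Hence

$$n! = |S_n| = \sum_{k=0}^{n} \binom{n}{k} |D_{n-k}| = \sum_{k=0}^{n} \binom{n}{k} |D_k|,$$

where the last equality substitutes $n - k$ for $k$ and uses $\binom{n}{k} = \binom{n}{n-k}$.
Solving for the term with $k = n$ gives the second formula.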


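summation: Proof by induction on $n$. If $n = 1$, then
$|D_1| = 0 = 1! \sum_{k=0}^{1} \frac{(-1)^k}{k!} = 1 \cdot (1 - 1)$. This gives an induction basis.

Consider $n \geq 2$. Then, using the induction hypothesis (IH):

$$|D_n| = n|D_{n-1}| + (-1)^n \overset{\text{IH}}{=} n\,(n-1)! \sum_{k=0}^{n-1} \frac{(-1)^k}{k!} + (-1)^n = n! \sum_{k=0}^{n-1} \frac{(-1)^k}{k!} + n!\,\frac{(-1)^n}{n!} = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}.$$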
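explicit: Recall that $e^x = \sum_{k \geq 0} \frac{x^k}{k!}$. Then

$$|D_n| = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!} = n! \Bigl( e^{-1} - \sum_{k=n+1}^{\infty} \frac{(-1)^k}{k!} \Bigr) = \frac{n!}{e} - n! \sum_{k=n+1}^{\infty} \frac{(-1)^k}{k!}.$$

So for sufficiently large $n$ we have $|D_n| \approx \frac{n!}{e}$. Applying standard
arguments for the speed of convergence of series we obtain
$|D_n| = \lfloor \frac{n!}{e} + \frac{1}{2} \rfloor$.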
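The summation formula and the explicit formula can be compared numerically. The sketch below uses exact integer arithmetic for the sum and floating point arithmetic for $n!/e$, so the second function is only reliable for moderate $n$ (an assumption of this sketch, not of the theorem); the function names are our own.

```python
from math import e, factorial, floor


def derangements_sum(n):
    """|D_n| = n! * sum_{k=0}^{n} (-1)^k / k!, evaluated with integers."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))


def derangements_explicit(n):
    """|D_n| = floor(n!/e + 1/2) for n >= 1 (floating point, moderate n only)."""
    return floor(factorial(n) / e + 0.5)


print([derangements_sum(n) for n in range(1, 8)])
print([derangements_explicit(n) for n in range(1, 8)])
```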


asymptotic: This follows immediately from Stirling’s formula $n! \sim \sqrt{2\pi n}\,\bigl(\frac{n}{e}\bigr)^n$
and the explicit result.


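generating function: Let $F(x) := \sum_{n \geq 0} \frac{|D_n|}{n!} x^n$. Then the derivative of $F$
is given by $F'(x) = \sum_{n \geq 1} \frac{|D_n|}{(n-1)!} x^{n-1} = \sum_{n \geq 0} \frac{|D_{n+1}|}{n!} x^n$. From
Recursion (i) we have

$$\frac{|D_{n+1}|}{n!} = \frac{|D_n|}{(n-1)!} + \frac{|D_{n-1}|}{(n-1)!} \quad (n \geq 1)$$

and thus, since the term for $n = 0$ vanishes because $|D_1| = 0$,
$F'(x) = xF'(x) + xF(x)$, i.e. $(1-x)F'(x) = xF(x)$.
This differential equation with $F(0) = 1$ is solved by $F(x) = \frac{e^{-x}}{1-x}$.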


2 Inclusion-Exclusion-Principle
and Möbius Inversion


In the last theorem of the previous chapter, and in several other places, we have
seen summations with alternating signs. This chapter deals with results of this
kind. Consider a finite set $X$ and some properties $P_1, \dots, P_m$ such that
for each $x \in X$ and each $i \in [m]$ it is known whether $x$ has property $P_i$ or
not. We are interested in the number of elements from $X$ satisfying none of the
properties $P_i$.


Example. Let $X$ be the set of students in the room and let $P_1$ and $P_2$ be the
properties “being male” and “studies math”, respectively. Suppose there are 36
students, 26 of which are male and 32 of which study math. Among the male
students 23 study math. We are interested in the number of students which are
neither male nor study math. Let $X_1$, $X_2$ denote the set of male students and
the set of math students, respectively. Then

$$|X \setminus (X_1 \cup X_2)| = |X| - |X_1| - |X_2| + |X_1 \cap X_2| = 36 - 26 - 32 + 23 = 1.$$


See Figure 17. 



Figure 17: When there are 26 male and 32 math students among 36 students 
in the class, and 23 of the male students study math, then there is exactly 1 
female student who does not study math. 


2.1. The Inclusion-Exclusion Principle 


Let $P_1, \dots, P_m$ be some $m$ properties and $X$ be an $n$-element set, where for
each $x \in X$ and each $i \in \{1, \dots, m\}$ we have that $x$ has property $P_i$ or $x$ does
not have property $P_i$. For a subset $S$ of properties, let $N(S)$ denote the set
of elements in $X$ that have property $P$ for all $P \in S$. Note that $N(\emptyset) = X$.
Clearly, a property $P_i$ just corresponds to the subset $X_i$ of elements of $X$ having
this property and $N(S) = \bigcap_{i \in S} X_i$. However, taking properties instead of sets
has the advantage that we can take the same subset more than once.


Theorem 2.1 (Inclusion-Exclusion Principle).

Let $X$ be a finite set and $P_1, \dots, P_m$ properties. Further define for $S \subseteq [m]$ the
set $N(S) = \{x \in X \mid \forall i \in S: x \text{ has } P_i\}$, i.e. the set of all elements from $X$
having property $P_i$ for all $i \in S$. The number of elements of $X$ that satisfy none
of the properties $P_1, \dots, P_m$ is given by

$$\sum_{S \subseteq [m]} (-1)^{|S|} |N(S)|. \qquad (1)$$

Proof. Consider any $x \in X$. If $x \in X$ has none of the properties, then $x \in N(\emptyset)$
and $x \notin N(S)$ for any $S \neq \emptyset$. Hence $x$ contributes 1 to the sum (1).

If $x \in X$ has exactly $k \geq 1$ of the properties, call this set of properties
$T \in \binom{[m]}{k}$. Then $x \in N(S)$ iff $S \subseteq T$.

The contribution of $x$ to the sum (1) is $\sum_{S \subseteq T} (-1)^{|S|} = \sum_{i=0}^{k} \binom{k}{i} (-1)^i = 0$.

In the last step we used that for any $y \in \mathbb{R}$ we have $(1-y)^k = \sum_{i=0}^{k} \binom{k}{i} (-y)^i$,
which implies (for $y = 1$) that $0 = \sum_{i=0}^{k} \binom{k}{i} (-1)^i$.


The previous theorem can also be proved inductively. 


In some settings we might be interested in a set of elements having a certain
set of properties $A$ and none of the properties from a set of properties $B$. We
can handle this by considering the opposites $\bar{A}$ of the properties from $A$ and
searching for the elements having none of the properties from $\bar{A}$ and $B$. In other
settings we might be interested in the number of elements satisfying at least one
of the properties. The following corollary answers this question.


Corollary 2.2. The number of elements of $X$ that have at least one of the
properties $P_1, \dots, P_m$ is given by

$$|X| - \sum_{S \subseteq [m]} (-1)^{|S|} |N(S)| = \sum_{\emptyset \neq S \subseteq [m]} (-1)^{|S|-1} |N(S)|.$$
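Theorem 2.1 and Corollary 2.2 can be evaluated directly by summing over all subsets of properties. The following sketch does this for an example of our own choosing, $X = \{1, \dots, 30\}$ with the properties "divisible by 2", "divisible by 3" and "divisible by 5"; the function names are hypothetical.

```python
from itertools import combinations


def count_with_none(universe, properties):
    """Sum over all S of (-1)^|S| * |N(S)|, as in Theorem 2.1."""
    total = 0
    for size in range(len(properties) + 1):
        for S in combinations(properties, size):
            n_S = sum(1 for x in universe if all(has(x) for has in S))
            total += (-1) ** size * n_S
    return total


def count_with_at_least_one(universe, properties):
    """Sum over non-empty S of (-1)^(|S|-1) * |N(S)|, as in Corollary 2.2."""
    total = 0
    for size in range(1, len(properties) + 1):
        for S in combinations(properties, size):
            n_S = sum(1 for x in universe if all(has(x) for has in S))
            total += (-1) ** (size - 1) * n_S
    return total


universe = range(1, 31)
properties = [lambda x: x % 2 == 0, lambda x: x % 3 == 0, lambda x: x % 5 == 0]
print(count_with_none(universe, properties),
      count_with_at_least_one(universe, properties))
```

The two printed numbers add up to $|X| = 30$, as they must.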


Given a set $X$ and three properties $P_1, P_2, P_3$, the Inclusion-Exclusion-Principle
states that the number of elements not in $P_1 \cup P_2 \cup P_3$ is:

$$|X| - |P_1| - |P_2| - |P_3| + |P_1 \cap P_2| + |P_1 \cap P_3| + |P_2 \cap P_3| - |P_1 \cap P_2 \cap P_3|.$$

Figure 18 illustrates this: For example, the number of elements in $P_1 \setminus (P_2 \cup P_3)$
is subtracted once from the total number of elements. The number of elements
in $(P_1 \cap P_2) \setminus P_3$ is subtracted twice in the beginning (since the elements are
in $P_1$ as well as in $P_2$) and then added back once. The number of elements
in $P_1 \cap P_2 \cap P_3$ is subtracted three times in the beginning, then added back
three times and finally subtracted yet again. The same holds for the other
intersections. Altogether one can see that each element in $P_1 \cup P_2 \cup P_3$ contributes
exactly $-1$ to the terms following $|X|$, which cancels its contribution of $+1$
to $|X|$.

Figure 18: An illustration of the inclusion-exclusion principle for three
properties.


2.1.1 Applications 


In the following we apply the principle of inclusion and exclusion (PIE) to count 
things. The arguments have a fairly rigid pattern: 


(i) Define “bad” properties: We identify the things to count as the elements
of some universe $X$ except for those having at least one of a set of
properties $P_1, \dots, P_m$. The corresponding sets are denoted by $X_1, \dots, X_m$ (i.e.
$X_i$ is the set of elements having property $P_i$). Given this reformulation of
our problem we want to count $X \setminus (X_1 \cup \dots \cup X_m)$.


(ii) Count N(S): For each $S \subseteq [m]$, determine $|N(S)|$, the number of elements
of $X$ having all bad properties $P_i$ for $i \in S$.


(iii) Apply PIE: Apply Theorem 2.1, i.e. the principle of inclusion and
exclusion. This yields a closed formula for $|X \setminus (X_1 \cup \dots \cup X_m)|$, typically
with one unresolved summation sign.


Theorem 2.3 (Surjections). The number of surjections from $[k]$ to $[n]$ is:

$$\sum_{i=0}^{n} (-1)^i \binom{n}{i} (n-i)^k.$$

Proof. Define bad properties: Let $X$ be the set of all maps from $[k]$ to $[n]$.
Define the “bad” property $P_i$ for $i \in [n]$ as “$i$ is not in the image of $f$”,
i.e.

$$f: [k] \to [n] \text{ has property } P_i \;:\Longleftrightarrow\; \forall j \in [k]: f(j) \neq i.$$

With this definition, the surjective functions are exactly those functions
that have no bad property, i.e. we need to count $X \setminus (X_1 \cup \dots \cup X_n)$.


Count N(S): We claim $|N(S)| = (n - |S|)^k$ for any $S \subseteq [n]$. To see this, observe
that $f$ has all properties with indices from $S$ if and only if $f(i) \notin S$ for all
$i \in [k]$. In other words, $f$ must be a function from $[k]$ to $[n] \setminus S$ and there
are $(n - |S|)^k$ of those.


Apply PIE: Using Theorem 2.1, the number of surjections is therefore:

$$|X \setminus (X_1 \cup \dots \cup X_n)| = \sum_{S \subseteq [n]} (-1)^{|S|} |N(S)| = \sum_{S \subseteq [n]} (-1)^{|S|} (n - |S|)^k = \sum_{i=0}^{n} (-1)^i \binom{n}{i} (n-i)^k.$$


In the last step we used that $(-1)^{|S|}(n - |S|)^k$ only depends on the size of
$S$ and there are $\binom{n}{i}$ sets $S \subseteq [n]$ of size $i$.

Corollary 2.4. (i) Consider the case $n = k$. A function from $[n]$ to $[n]$ is a
surjection if and only if it is a bijection. Since there are $n!$ bijections on
$[n]$ (all permutations) we obtain the identity:

$$n! = \sum_{i=0}^{n} (-1)^i \binom{n}{i} (n-i)^n.$$

(ii) A surjection from $[k]$ to $[n]$ can be seen as a partition of $[k]$ into $n$
non-empty distinguishable parts (the map assigns a part to each $i \in [k]$). Since
the partitions of $[k]$ into $n$ non-empty indistinguishable parts are counted by
$s^{II}_n(k)$ and there are $n!$ ways to assign labels to the $n$ parts, we obtain that
the number of surjections is equal to $n!\, s^{II}_n(k)$. This proves the identity:

$$n!\, s^{II}_n(k) = \sum_{i=0}^{n} (-1)^i \binom{n}{i} (n-i)^k.$$
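The surjection formula and the identity of Corollary 2.4 (i) can be checked for small parameters by enumerating all maps. The sketch below uses our own function names.

```python
from itertools import product
from math import comb, factorial


def surjections_pie(k, n):
    """Number of surjections [k] -> [n] by the inclusion-exclusion formula."""
    return sum((-1) ** i * comb(n, i) * (n - i) ** k for i in range(n + 1))


def surjections_brute(k, n):
    """Count maps [k] -> [n] whose image is all of [n]."""
    return sum(1 for f in product(range(n), repeat=k) if len(set(f)) == n)


print(surjections_pie(4, 3), surjections_brute(4, 3))
print([surjections_pie(n, n) == factorial(n) for n in range(1, 7)])
```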


Theorem 2.5 (Derangements revisited). Recall that for $n \in \mathbb{N}$, the
derangements $D_n$ on $n$ elements are permutations of $[n]$ without fixed points. We claim:

$$|D_n| = \sum_{i=0}^{n} (-1)^i \binom{n}{i} (n-i)!$$

Proof. Define bad properties: Let $X$ be the set of all permutations of $[n]$.
We define the “bad property” $P_i$ that means “$\pi$ has the fixpoint $i$”:

$$\pi \in X \text{ has property } P_i \;:\Longleftrightarrow\; \pi(i) = i \quad (i \in [n]).$$

Derangements are exactly permutations that have none of these properties.

Count N(S): We claim $|N(S)| = (n - |S|)!$ for any $S \subseteq [n]$.
Indeed, $\pi \in X$ has all properties with indices from $S$ if and only if all
$i \in S$ are fixed points of $\pi$. On the other elements, i.e. on $[n] \setminus S$, $\pi$ may
be an arbitrary bijection so there are $(n - |S|)!$ choices for $\pi$.


Apply PIE: Using Theorem 2.1, the number of derangements is therefore:

$$|X \setminus (X_1 \cup \dots \cup X_n)| = \sum_{S \subseteq [n]} (-1)^{|S|} |N(S)| = \sum_{S \subseteq [n]} (-1)^{|S|} (n - |S|)! = \sum_{i=0}^{n} (-1)^i \binom{n}{i} (n-i)!$$

In the last step we used that $(-1)^{|S|}(n - |S|)!$ only depends on the size of
$S$ and there are $\binom{n}{i}$ sets $S \subseteq [n]$ of size $i$.


Theorem 2.6 (Combinations of multisets). Consider a multiset $M$ with types
$1, \dots, m$ and repetition numbers $r_1, \dots, r_m$. Then the number of $k$-combinations
of $M$ is:

$$\sum_{S \subseteq [m]} (-1)^{|S|} \binom{m - 1 + k - \sum_{i \in S}(r_i + 1)}{m - 1},$$

where we define $\binom{a}{b} := 0$ for $a < b$.

Proof. Define bad properties: Let $X$ be the set of $k$-combinations where we
disregard the restrictions the repetition numbers impose, in other words
$X$ is the set of $k$-combinations of $\tilde{M}$, where $\tilde{M}$ is the multiset with the
same $m$ types as in $M$ but infinite supply of each type (i.e. “$r_i = \infty$” for
each $i \in [m]$).

Recall that by Theorem 1.9, $|X| = \binom{m-1+k}{m-1}$. Define the bad property $P_i$
as:

$$A \in X \text{ has property } P_i \;:\Longleftrightarrow\; \text{type } i \text{ is repeated in } A \text{ at least } r_i + 1 \text{ times.}$$

Then the $k$-combinations of $M$ are exactly those $k$-combinations in $X$ that
have none of the bad properties.

Count N(S): We claim that for any $S \subseteq [m]$:

$$|N(S)| = \binom{m - 1 + k - \sum_{i \in S}(r_i + 1)}{m - 1}.$$

To see this, note that for a combination in $N(S)$ the repetition number of each
type $i \in S$ is at least $r_i + 1$. This fixes $r_i + 1$ elements of the $k$-combination.
In total $\sum_{i \in S}(r_i + 1)$ elements are fixed this way and what remains to be chosen
is a $(k - \sum_{i \in S}(r_i + 1))$-combination of the multiset with infinite supply of
each type. Again by Theorem 1.9 there are $\binom{m - 1 + k - \sum_{i \in S}(r_i + 1)}{m - 1}$
such combinations if $k - \sum_{i \in S}(r_i + 1) \geq 0$ and no such combinations
otherwise.


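Apply PIE: Using Theorem 2.1, the number of $k$-combinations of $M$ is
therefore:

$$|X \setminus (X_1 \cup \dots \cup X_m)| = \sum_{S \subseteq [m]} (-1)^{|S|} |N(S)| = \sum_{S \subseteq [m]} (-1)^{|S|} \binom{m - 1 + k - \sum_{i \in S}(r_i + 1)}{m - 1}.$$

In the special case where all the repetition numbers are equal, i.e. $r_1 = r_2 =
\dots = r_m = r$, this can be simplified to:

$$\sum_{i=0}^{m} (-1)^i \binom{m}{i} \binom{m - 1 + k - i(r+1)}{m - 1}.$$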
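The formula of Theorem 2.6 can be compared with a direct enumeration of the possible repetition counts. The following is a minimal sketch; the function names and the example multiset with repetition numbers 3, 4, 5 are our own choices.

```python
from itertools import combinations, product
from math import comb


def binom(a, b):
    """Binomial coefficient with the convention binom(a, b) = 0 for a < b."""
    return comb(a, b) if a >= b >= 0 else 0


def multiset_combinations_pie(k, reps):
    """Number of k-combinations of a multiset with repetition numbers `reps`."""
    m = len(reps)
    total = 0
    for size in range(m + 1):
        for S in combinations(range(m), size):
            fixed = sum(reps[i] + 1 for i in S)
            total += (-1) ** size * binom(m - 1 + k - fixed, m - 1)
    return total


def multiset_combinations_brute(k, reps):
    """Count vectors (c_1, ..., c_m) with 0 <= c_i <= r_i and sum k."""
    return sum(1 for counts in product(*(range(r + 1) for r in reps))
               if sum(counts) == k)


print(multiset_combinations_pie(10, (3, 4, 5)),
      multiset_combinations_brute(10, (3, 4, 5)))
```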


Before we study a more advanced example, we prove a small result needed there: 


Lemma. There are $\frac{2n}{2n-r}\binom{2n-r}{r}$ binary sequences (with letters 0 and 1) such
that:


• The sequence has length $2n$ and exactly $r$ copies of 1 and $2n - r$ copies of
  0.


• No two copies of 1 are adjacent. Here the first and the last position of the
  sequence count as adjacent (the sequence is cyclic).


For example, if $n = 3$ and $r = 2$ there are 9 such sequences, namely:


101000, 100100, 100010, 010100, 010010, 010001, 001010, 001001, 000101.

Proof. Since no two copies of 1 may be adjacent, we know that after each 1
there must be a 0. So we can just imagine that one 0 is already “glued” to
every 1 and we are actually building a sequence with $2n - 2r$ copies of 0 and $r$
copies of 10.

However, one copy of 10 might “spill” across the border, i.e. the 1 could be
in position $2n$ and the 0 in position 1. We need to handle this case separately.


Case 1: The last position of the sequence is 1. Then the first position is 0 and
the remaining positions contain $r - 1$ copies of 10 and $2n - 2r$ copies of
0, now without any further pitfalls. There are $\binom{2n-r-1}{r-1}$ ways to arrange
them.


Case 2: The last position of the sequence is 0. Then our special character
10 does not spill across the border and the sequence is any ordered
arrangement of $r$ copies of 10 and $2n - 2r$ copies of 0. There are $\binom{2n-r}{r}$ of
them.


In total we have:

$$\binom{2n-r-1}{r-1} + \binom{2n-r}{r} = \frac{r}{2n-r}\binom{2n-r}{r} + \binom{2n-r}{r} = \frac{2n}{2n-r}\binom{2n-r}{r}.$$
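The count in the Lemma can be confirmed for small $n$ by enumerating all binary sequences of length $2n$. The sketch below uses our own function names and assumes $0 \leq r \leq n$.

```python
from itertools import product
from math import comb


def cyclic_sequences_brute(n, r):
    """Count cyclic binary sequences of length 2n with r non-adjacent ones."""
    length = 2 * n
    count = 0
    for seq in product((0, 1), repeat=length):
        if sum(seq) != r:
            continue
        # position length-1 is adjacent to position 0 (cyclic sequence)
        if all(not (seq[i] and seq[(i + 1) % length]) for i in range(length)):
            count += 1
    return count


def cyclic_sequences_formula(n, r):
    """2n/(2n-r) * binom(2n-r, r), computed with exact integer division."""
    return 2 * n * comb(2 * n - r, r) // (2 * n - r)


print(cyclic_sequences_brute(3, 2), cyclic_sequences_formula(3, 2))
```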


Example 2.7 (Problème des ménages). There are $n \geq 2$ married couples
at a dinner party, a husband and a wife each, denoted by $H_1, \dots, H_n$ and
$W_1, \dots, W_n$. They should be seated at a round table with $2n$ (distinguishable)
seats such that:


• Men and women alternate, i.e. no two men and no two women sit next
  to each other. Equivalently, the seats with even labels are used either
  exclusively by women or exclusively by men.


• All couples are separated, i.e. no husband sits next to his wife.


We want to count how many such assignments exist.

First count the number of ways to seat the women. They may be seated in
the even or odd seats and once this is fixed, there are $n!$ ways to seat them, so
$2 \cdot n!$ ways in total.

Assuming the women are already seated, we now count the number of ways
the men can be added. It is easy to see that this number does not depend on
how the women are seated so we can assume without loss of generality that
wife $W_i$ sits in seat $2i$ and the odd-numbered seats are still free. We count the
number of ways the men can join using PIE.


Define bad properties: Let $X$ be the set of all ways to seat the men without
paying attention to the rule that they must not sit next to their wives.
There are $|X| = n!$ ways to do it.


We define the bad property $P_i$ that captures that the husband $H_i$ sits next
to his wife $W_i$, i.e. on seat $2i - 1$ or on seat $2i + 1$ (all seat numbers are
taken modulo $2n$, in particular, $2n$ and 1 are adjacent).


The permitted arrangements are exactly those with none of the bad
properties.

Count N(S): This time, calculating $|N(S)|$ for arbitrary $S \subseteq [n]$ is tricky.
Instead we calculate

$$N^*(r) = \sum_{\substack{S \subseteq [n] \\ |S| = r}} |N(S)|, \quad \text{for } 0 \leq r \leq n,$$

which will be just as helpful. The meaning of this number is a bit subtle:
$N^*(r)$ is the number of pairs $(\pi, S)$ where $S \subseteq [n]$ is of size $r$ and $\pi$ is a
seating plan such that (wife $W_i$ sits at $2i$ and) the couples with indices in
$S$ are not separated. To better count these pairs, define a map $f$:


Under $f$, the pair $(\pi, S)$ is mapped to a binary sequence with $2n$ characters
and exactly $|S|$ copies of 1: For $i \in S$ (remember that $H_i$ and $W_i$ will be
assigned adjacent places in $\pi$) a one should be put in the position of the
husband $H_i$, if he sits left of his wife, or in the position of $W_i$ in $\pi$, if
she sits left of her husband.


It is time for an example. Consider the pair $(\pi, S)$ with $S = \{2, 5\}$ and
the seating

$$\begin{array}{rcccccccccc} \text{seat:} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \pi = ( & H_5 & W_1 & H_2 & W_2 & H_1 & W_3 & H_4 & W_4 & H_3 & W_5 \;). \end{array}$$

Note that the second and fifth couple are indeed not separated ($W_5$ and
$H_5$ are adjacent because the table is circular). Also note that the fourth
couple is also not separated, this is allowed. The mapping $f$ discussed
above would map this pair to the sequence 0010000001 since $H_2$ (in seat
3) and $W_5$ (in seat 10) sit left of their spouse. It is clear that $f$ will
never produce sequences with two adjacent ones: That would mean two
adjacent people are both sitting left of their spouse, which is impossible
(assuming $n \geq 2$). However, any sequence with $r$ copies of 1 and no two
adjacent copies of 1 is the image of $(n - r)!$ pairs $(\pi, S)$: From the $r$ copies
of 1, the set $S$ (of size $r$) can be reconstructed as can be the position
of the husbands with indices in $S$. The $(n - r)$ other husbands can be
distributed arbitrarily onto the $(n - r)$ remaining odd-numbered seats,
there are $(n - r)!$ ways to do so. This proves, using the Lemma above to
count the number of cyclic binary sequences of length $2n$ with $r$ copies of 1 and
no two adjacent copies of 1:

$$N^*(r) = (n - r)! \cdot \frac{2n}{2n - r}\binom{2n - r}{r}.$$

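Apply PIE: Using Theorem 2.1, the number of valid ways to arrange the
husbands between the wives is:

$$\begin{aligned} |X \setminus (X_1 \cup \dots \cup X_n)| &= \sum_{S \subseteq [n]} (-1)^{|S|} |N(S)| = \sum_{r=0}^{n} \sum_{\substack{S \subseteq [n] \\ |S| = r}} (-1)^{|S|} |N(S)| \\ &= \sum_{r=0}^{n} (-1)^r \sum_{\substack{S \subseteq [n] \\ |S| = r}} |N(S)| = \sum_{r=0}^{n} (-1)^r N^*(r) \\ &= \sum_{r=0}^{n} (-1)^r (n - r)! \cdot \frac{2n}{2n - r}\binom{2n - r}{r}. \end{aligned}$$

Multiplying this with $2 \cdot n!$, the number of ways to place the wives, the total
number of seatings is (for $n \geq 2$):

$$2 \cdot n! \sum_{r=0}^{n} (-1)^r (n - r)! \cdot \frac{2n}{2n - r}\binom{2n - r}{r}.$$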
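For small $n$ the formula can be compared with a direct enumeration of the seatings of the husbands. The sketch below fixes wife $W_i$ in seat $2i$ as in the text and uses 0-based indices internally, so husband $h$ (0-based) is next to his wife exactly in the odd seats with 0-based index $h$ and $(h + 1) \bmod n$; the function names are our own.

```python
from itertools import permutations
from math import comb, factorial


def husbands_pie(n):
    """Ways to seat the husbands, by the inclusion-exclusion formula."""
    return sum((-1) ** r * factorial(n - r)
               * (2 * n * comb(2 * n - r, r) // (2 * n - r))
               for r in range(n + 1))


def husbands_brute(n):
    """Enumerate seatings; p[j] is the husband in the odd seat 2j + 1."""
    count = 0
    for p in permutations(range(n)):
        # husband h is adjacent to his wife iff j == h or j == (h + 1) % n
        if all(p[j] != j and (p[j] + 1) % n != j for j in range(n)):
            count += 1
    return count


for n in range(3, 7):
    print(n, husbands_pie(n), husbands_brute(n), 2 * factorial(n) * husbands_pie(n))
```

The last column is the total number of seatings, including the $2 \cdot n!$ ways to place the wives.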


2.1.2 Stronger Version of PIE 


Theorem 2.1 can be strengthened as follows: 


Theorem 2.8 (Stronger PIE). Assume $f, g: 2^{[n]} \to \mathbb{R}$ are functions mapping
subsets of $[n]$ to real numbers and $g$ can be written in terms of $f$ as:

$$g(A) = \sum_{S \subseteq A} f(S) \quad (\text{for } A \subseteq [n]).$$

Then $f$ can also be written in terms of $g$ like this:

$$f(A) = \sum_{S \subseteq A} (-1)^{|A| - |S|} g(S) \quad (\text{for } A \subseteq [n]).$$


Before we prove Theorem 2.8, we first observe how it is indeed a generalization
of Theorem 2.1. So assume we have the setting of Theorem 2.1, i.e. properties
$P_1, \dots, P_m$ with corresponding sets $X_1, \dots, X_m$.

Then for $S \subseteq [m]$ define $f(S)$ to be the number of elements having all
properties with indices not in $S$ and none of the properties with indices in $S$,
i.e.

$$f(S) = \Bigl| \bigcap_{i \in [m] \setminus S} X_i \setminus \bigcup_{i \in S} X_i \Bigr|.$$

Note that $f([m]) = |X| - \bigl|\bigcup_{i \in [m]} X_i\bigr|$ is precisely the number we want to
count in Theorem 2.1.
The function $g$ fulfilling the requirement of Theorem 2.8 is given as:

$$g(A) = \sum_{S \subseteq A} f(S) = \Bigl| \bigcap_{i \in [m] \setminus A} X_i \Bigr| = |N([m] \setminus A)|.$$

To see the second “=”, note that $\sum_{S \subseteq A} f(S)$ counts an element $x$ if and only
if the set $S$ of all properties that $x$ does not have is a subset of $A$. This means
$x$ is counted if and only if it has all properties from $[m] \setminus A$.

We now apply Theorem 2.8 to obtain:

$$f([m]) = \sum_{S \subseteq [m]} (-1)^{m - |S|} g(S) = \sum_{S \subseteq [m]} (-1)^{m - |S|} |N([m] \setminus S)| = \sum_{S \subseteq [m]} (-1)^{|S|} |N(S)|,$$

where in the last step we changed the order of summation, i.e. summed over
$[m] \setminus S$ instead of $S$. This concludes the proof of (Thm. 2.8 $\Rightarrow$ Thm. 2.1).


Theorem 2.8 establishes a result similar to Möbius Inversion, which we
consider later. In both cases there are two “linked” functions $f$ and $g$ where $g$ is
given in terms of $f$. The theorems assert that $f$ can then also be given in terms
of $g$. The subset relation “$\subseteq$” will be replaced by the notion of divisibility “$|$”.
Both are order relations and, in fact, there are results even more general than
both Theorem 2.8 and Möbius Inversion, dealing with general partial orders
(but we will not consider them here).

We now prove Theorem 2.8.


Proof. Consider (for $A \subseteq [n]$) the term that is claimed to be equal to $f(A)$:

$$\sum_{S \subseteq A} (-1)^{|A| - |S|} g(S) = \sum_{S \subseteq A} (-1)^{|A| - |S|} \sum_{T \subseteq S} f(T) = \sum_{T \subseteq A} c_T f(T)$$

for appropriate $c_T$ (to be determined!), that captures how often $f(T)$ is
encountered (for $T \subseteq A$). Observe $c_A = 1$, since $f(A)$ is only encountered for
$T = S = A$. For a proper subset $T \subsetneq A$ (and $k := |A| - |T|$) we have:

$$c_T = \sum_{T \subseteq S \subseteq A} (-1)^{|A| - |S|} = \sum_{i=0}^{k} (-1)^{k - i} \binom{k}{i} = 0$$

where in the second step we observed that picking a set between $T$ and $A$ is
equivalent to picking a subset of $A \setminus T$. The last step is an identity we already
saw.

This proves the claim.
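Theorem 2.8 can be tested on an arbitrary function $f$: compute $g$ from $f$ by summing over subsets and then recover $f$ from $g$ with the alternating sum. The following sketch does this for $n = 4$ with pseudo-random integer values; the helper name `subsets` and the choice of values are our own.

```python
from itertools import combinations
import random


def subsets(A):
    """Yield all subsets of A as frozensets."""
    elements = sorted(A)
    for size in range(len(elements) + 1):
        for S in combinations(elements, size):
            yield frozenset(S)


n = 4
ground = frozenset(range(1, n + 1))
rng = random.Random(0)

# An arbitrary function f on the subsets of [n].
f = {S: rng.randint(-5, 5) for S in subsets(ground)}
# g(A) = sum of f(S) over all S contained in A.
g = {A: sum(f[S] for S in subsets(A)) for A in subsets(ground)}
# f recovered from g: f(A) = sum of (-1)^(|A|-|S|) g(S) over all S in A.
f_back = {A: sum((-1) ** (len(A) - len(S)) * g[S] for S in subsets(A))
          for A in subsets(ground)}

print(f_back == f)
```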


2.2. Möbius Inversion Formula


Any positive number $n \in \mathbb{N}$ has a unique decomposition into primes, i.e. $n$ can
be written as $n = p_1^{a_1} \cdot p_2^{a_2} \cdot \ldots \cdot p_k^{a_k}$ for primes $p_1, \ldots, p_k$ with multiplicities
$a_1, \ldots, a_k$. In other words, the factors of $n$ are given as a multiset $M(n)$ with
types $p_1, \ldots, p_k$ and repetition numbers $a_1, \ldots, a_k$.

For example $360 = 2^3 \cdot 3^2 \cdot 5$ and $M(360) = \{2, 2, 2, 3, 3, 5\} = \{3 \cdot 2, 2 \cdot 3, 1 \cdot 5\}$
(which admittedly looks a bit quaint written that way).

We write $n \mid m$ if $n$ divides $m$, meaning there is a number $a \in \mathbb{N}$ such that
$a \cdot n = m$. In terms of multisets this means $M(n) \subseteq M(m)$.

The greatest common divisor (gcd) of $m$ and $n$ is the number $k = \gcd(m, n)$
with $M(k) = M(m) \cap M(n)$. For example $\gcd(12, 90) = 6$ since $M(6) = \{2, 3\} =
\{2, 2, 3\} \cap \{2, 3, 3, 5\} = M(12) \cap M(90)$.

Note that $M(1) = \emptyset$ (1 is the result of the empty product). If $\gcd(m, n) = 1$,
then $m$ and $n$ are called relatively prime (or coprime).

We define Euler's $\varphi$-function for $n \geq 2$ as:

$$\varphi(n) = \#\{k \in \mathbb{N} \mid 1 \leq k \leq n,\ \gcd(n, k) = 1\}$$

For example, $\varphi(12) = \#\{1, 5, 7, 11\} = 4$ and $\varphi(9) = \#\{1, 2, 4, 5, 7, 8\} = 6$. A
formula to compute Euler's $\varphi$-function is given in the following theorem.


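Theorem 2.9. If $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$ then

$$\varphi(n) = n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right).$$

Consider for instance $n = 12 = 2^2 \cdot 3$. Then $\varphi(12) = 12 \cdot (1 - \frac{1}{2}) \cdot (1 - \frac{1}{3}) =
12 \cdot \frac{1}{2} \cdot \frac{2}{3} = 4$, confirming what we counted by hand before.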


Proof. In the proof we use the principle of inclusion and exclusion. 


Define bad properties: As ground set we use $X = [n]$. We say a number
$x \in X$ has the bad property $P_i$ if $x$ is divisible by $p_i$, i.e.:

$$x \in [n] \text{ has property } P_i \;:\Longleftrightarrow\; p_i \mid x \qquad (i \in [k]).$$

Then the numbers in $[n]$ that are relatively prime to $n$ are exactly those
that have no bad property.

Count N(S): For any $S \subseteq [k]$ we have $N(S) = \dfrac{n}{\prod_{i \in S} p_i}$.

This is because, if $x \in X$ is a multiple of all primes with indices from $S$,
then $x$ must be a multiple of their product $\prod_{i \in S} p_i$ (which is the least
common multiple of those primes). Since this product divides $n$, there are
no rounding issues and the number of numbers $x$ between 1 and $n$ that
are multiples of $\prod_{i \in S} p_i$ is just given as the quotient.

Apply PIE: Using Theorem 2.1, we can write $\varphi(n)$ as:

$$\varphi(n) = \sum_{S \subseteq [k]} (-1)^{|S|} N(S) = \sum_{S \subseteq [k]} (-1)^{|S|} \frac{n}{\prod_{i \in S} p_i} = n \prod_{i \in [k]} \left(1 - \frac{1}{p_i}\right).$$

The last identity is best seen by multiplying out the right side. For each
of the $k$ factors we can either choose $1$ or we can choose $-\frac{1}{p_i}$. The indices
$i$ where $-\frac{1}{p_i}$ was chosen are captured in the set $S$. For each $S \subseteq [k]$ we
get exactly the term under the sum.
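The three expressions for Euler's function that appear in this proof (counting by hand, the inclusion-exclusion sum, and the product formula) can be compared directly. This is a small sketch with hypothetical helper names; trial division is used only because it is short, not because it is efficient.

```python
from fractions import Fraction
from itertools import combinations
from math import gcd

def prime_factors(n):
    """Distinct primes dividing n, found by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def phi_count(n):
    """Count the k in [n] with gcd(n, k) = 1 directly."""
    return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)

def phi_pie(n):
    """Inclusion-exclusion: sum over S of (-1)^|S| * n / prod_{i in S} p_i."""
    primes = prime_factors(n)
    total = 0
    for r in range(len(primes) + 1):
        for chosen in combinations(primes, r):
            prod = 1
            for p in chosen:
                prod *= p
            total += (-1) ** r * (n // prod)   # prod divides n, so this is exact
    return total

def phi_product(n):
    """Product formula n * prod (1 - 1/p_i), computed with exact fractions."""
    result = Fraction(n)
    for p in prime_factors(n):
        result *= 1 - Fraction(1, p)
    return int(result)

print(phi_count(12), phi_pie(12), phi_product(12))
```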


We now show a few more number theoretic results leading up to Möbius
Inversion.

Theorem 2.10. $n = \sum_{d \mid n} \varphi(d)$.

Proof. We claim that for a divisor $d$ of $n$ the sets $\{x \in [n] \mid \gcd(x, n) = d\}$ and
$\{y \in [\frac{n}{d}] \mid \gcd(y, \frac{n}{d}) = 1\}$ have the same cardinality. Indeed, it is easy to see
that $x \mapsto y := \frac{x}{d}$ is a bijection.

This means $\#\{x \in [n] \mid \gcd(x, n) = d\} = \varphi(\frac{n}{d})$. Summing these identities
for all $d \mid n$ yields:

$$n = \sum_{d \mid n} \varphi\!\left(\frac{n}{d}\right).$$

This is almost identical to the claim, the only thing left to do is to change the
order of summation: Substitute $d$ with $d' := \frac{n}{d}$ and note that $d'$ divides $n$ iff $d$
divides $n$.


We now define the Möbius function for $d \geq 1$ as:

$$\mu(d) := \begin{cases} 1 & d \text{ is the product of an even number of distinct primes} \\ -1 & d \text{ is the product of an odd number of distinct primes} \\ 0 & \text{otherwise} \end{cases}$$

For example, $\mu(15) = 1$, $\mu(7) = \mu(30) = -1$, $\mu(12) = 0$, since $15 = 3 \cdot 5$ is the
product of two distinct primes, $7 = 7$ and $30 = 2 \cdot 3 \cdot 5$ are products of an
odd number of distinct primes and $12 = 2 \cdot 2 \cdot 3$ is not the product of distinct primes:
We need 2 twice. The numbers $n$ with $\mu(n) \neq 0$ are also called square-free since
they do not have a square as a factor (12 is not square-free since it has 4 as a
factor).

Note that $\mu(1) = 1$ since 1 is the product of 0 primes and 0 is even.


Lemma 2.11.

$$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{if } n \neq 1 \end{cases}$$

Proof. If $n = 1$ this is obvious since $d = 1$ is the only divisor of 1 and $\mu(1) = 1$.
For $n \geq 2$ we write $n = p_1^{a_1} \cdots p_k^{a_k}$. Then we have:

$$\sum_{d \mid n} \mu(d) = \sum_{d \mid (p_1 p_2 \cdots p_k)} \mu(d) = \sum_{D \subseteq [k]} (-1)^{|D|} = \sum_{i=0}^{k} \binom{k}{i} (-1)^i = 0.$$

In the first step remember that $\mu(d) = 0$ if $d$ is not square-free. This means if
$d$ contains a prime factor more than once, it does not contribute to the sum.
For the second step, realize that a divisor $d$ of $p_1 \cdots p_k$ is just given by choosing
a subset $D$ of those $k$ primes and multiplying them. Then $\mu(d)$ will be 1 if an
even number of primes were chosen and $-1$ otherwise. The last two steps are
identities we already encountered earlier.
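A direct implementation makes the definition of $\mu$ and the statement of Lemma 2.11 easy to test. The sketch below uses trial division; the function names are chosen for this illustration only.

```python
def mobius(d):
    """mu(d): 0 if a square divides d, otherwise (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:        # p divides the original d at least twice
                return 0
            result = -result
        p += 1
    if d > 1:                     # one remaining prime factor
        result = -result
    return result

def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius_divisor_sum(n):
    """The sum of mu(d) over all divisors d of n (Lemma 2.11)."""
    return sum(mobius(d) for d in divisors(n))

print([mobius(d) for d in (1, 7, 12, 15, 30)])
print([mobius_divisor_sum(n) for n in range(1, 11)])
```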


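Corollary 2.12.

$$\frac{\varphi(n)}{n} = \sum_{d \mid n} \frac{\mu(d)}{d}.$$

Proof. We use an identity that came up in the proof of Theorem 2.9 and
arguments similar to those from the proof of Lemma 2.11.

$$\varphi(n) = n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right) = n \sum_{S \subseteq [k]} \frac{(-1)^{|S|}}{\prod_{i \in S} p_i} = n \sum_{d \mid p_1 \cdots p_k} \frac{\mu(d)}{d} = n \sum_{d \mid n} \frac{\mu(d)}{d}.$$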
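We now have all ingredients to prove the Möbius Inversion Formula:

Theorem 2.13 (Möbius Inversion).
Let $f, g \colon \mathbb{N} \to \mathbb{R}$ be functions satisfying $g(n) = \sum_{d \mid n} f(d)$. Then:

$$f(n) = \sum_{d \mid n} \mu(d)\, g\!\left(\frac{n}{d}\right).$$

Proof. We start by changing the order of summation ($d \to \frac{n}{d}$) and using the
definition of $g$.

$$\begin{aligned}
\sum_{d \mid n} \mu(d)\, g\!\left(\frac{n}{d}\right) &= \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) g(d) \\
&= \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) \sum_{d' \mid d} f(d') \\
&= \sum_{d' \mid n} c_{d'} \cdot f(d')
\end{aligned}$$

where the $c_{d'}$ are numbers (to be determined!) that count, with sign, how often $f(d')$ occurs
as a summand. Note that $c_n = \mu(1) = 1$ since $f(n)$ occurs only for $d' = d = n$.
For $d' \mid n$, $d' \neq n$ we have:

$$c_{d'} = \sum_{d' \mid d \mid n} \mu\!\left(\frac{n}{d}\right) = \sum_{m \mid \frac{n}{d'}} \mu(m) = 0,$$

where we substituted the summation index $d \to m := \frac{n}{d}$ and used Lemma 2.11
in the last step. This proves the claim.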
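As a quick plausibility check, Theorem 2.10 says that $g(n) = n$ is the divisor sum of $f = \varphi$, so applying the inversion formula to $g(n) = n$ should give back $\varphi$. The sketch below does exactly that; `mobius_invert` and the other helper names are hypothetical and only meant for this illustration.

```python
from math import gcd

def mobius(d):
    """mu(d): 0 if a square divides d, otherwise (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0
            result = -result
        p += 1
    if d > 1:
        result = -result
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius_invert(g, n):
    """f(n) = sum over d | n of mu(d) * g(n / d)."""
    return sum(mobius(d) * g(n // d) for d in divisors(n))

def phi(n):
    """Euler's function by direct counting."""
    return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)

# Invert g(n) = n and compare with phi.
recovered = [mobius_invert(lambda m: m, n) for n in range(1, 50)]
print(recovered[:12])
```

The same comparison is another way to read Corollary 2.12, since $\sum_{d \mid n} \mu(d) \frac{n}{d} = \varphi(n)$.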


Example 2.14 (Circular Sequences of 0's and 1's). A circular sequence is a
sequence of 0's and 1's written around a circle.

[Figure: examples of circular sequences.]

Circular sequences are considered equal if they can be transformed into one
another by rotation. We would write

$$A = 001011010 = 110100010 = B \neq C = 100100100.$$

Let $N_n$ be the number of circular 0/1-sequences of length $n$ and $M(d)$ the
number of aperiodic circular sequences of length $d$, where a sequence is called
aperiodic if it cannot be written as several copies of a shorter sequence. For
example, $C$ is not aperiodic since it consists of three copies of "100" while $A$ (and
therefore $B$) is aperiodic.

For every circular sequence $S$ of length $n$ there is a unique way to describe
it as "repetitions of $S'$" where $S'$ is an aperiodic circular sequence. This is not
trivial, but also not hard to see. Clearly, the length of $S'$ must divide $n$. So we
found:

$$N_n = \sum_{d \mid n} M(d). \qquad (\star)$$
Consider Table 2 for an example with n = 6.

 d   aperiodic sequences counted by M(d)   corresponding sequences of length 6
 1   0, 1                                  000000, 111111
 2   01                                    010101
 3   001, 011                              001001, 011011
 6   000001, 000011, 000101,               000001, 000011, 000101,
     000111, 001011, 010011,               000111, 001011, 010011,
     001111, 010111, 011111                001111, 010111, 011111

Table 2: Aperiodic sequences of all lengths that divide 6. There is a bijection
between them and the circular sequences of length 6.


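Another crucial observation is that the number of aperiodic linear sequences
of length $d$ is just $d \cdot M(d)$ (every aperiodic circular sequence has $d$ distinct
rotations). Every linear sequence of length $n$ can be written uniquely as a
repetition of some aperiodic linear sequence whose length divides $n$, so:

$$2^n = \sum_{d \mid n} d \cdot M(d). \qquad (\triangle)$$

Now define $g(n) = 2^n$ and $f(d) = d \cdot M(d)$. These choices of $f$ and $g$ fulfill
the requirement of the Möbius Inversion Formula so we obtain:

$$f(n) = \sum_{d \mid n} \mu(d)\, g\!\left(\frac{n}{d}\right), \qquad \text{i.e.} \qquad n M(n) = \sum_{d \mid n} \mu(d)\, 2^{n/d}.$$

Rewriting this as $M(d) = \frac{1}{d} \sum_{l \mid d} \mu(\frac{d}{l}) 2^l$ and using Corollary 2.12 in the
second-to-last step, we get:

$$\begin{aligned}
N_n = \sum_{d \mid n} M(d) &= \sum_{d \mid n} \frac{1}{d} \sum_{l \mid d} \mu\!\left(\frac{d}{l}\right) 2^l
 = \sum_{l \mid n} 2^l \sum_{l \mid d \mid n} \frac{1}{d}\, \mu\!\left(\frac{d}{l}\right) \\
&= \sum_{l \mid n} \frac{2^l}{l} \sum_{k \mid \frac{n}{l}} \frac{\mu(k)}{k}
 = \sum_{l \mid n} \frac{2^l}{l} \cdot \frac{\varphi(n/l)}{n/l}
 = \frac{1}{n} \sum_{l \mid n} \varphi\!\left(\frac{n}{l}\right) 2^l.
\end{aligned}$$

While not overwhelmingly pretty, at least our result is an explicit formula only
involving one sum and Euler's $\varphi$-function. This is as good as it gets.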
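The final formula can be compared against a brute-force count for small $n$. In the sketch below, each 0/1-string is replaced by its lexicographically smallest rotation, so that two strings represent the same circular sequence exactly if these canonical forms agree. The function names are illustrative only.

```python
from itertools import product
from math import gcd

def phi(n):
    """Euler's function by direct counting (phi(1) = 1)."""
    return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def count_circular_brute(n):
    """Count 0/1-strings of length n up to rotation."""
    seen = set()
    for bits in product("01", repeat=n):
        s = "".join(bits)
        canonical = min(s[i:] + s[:i] for i in range(n))
        seen.add(canonical)
    return len(seen)

def count_circular_formula(n):
    """N_n = (1/n) * sum over l | n of phi(n / l) * 2^l."""
    total = sum(phi(n // ell) * 2 ** ell for ell in divisors(n))
    return total // n

print([count_circular_formula(n) for n in range(1, 11)])
```

For $n = 6$ both functions give 14, matching the 2 + 1 + 2 + 9 sequences listed in Table 2.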
3 Generating Functions


In the following we consider sequences $(a_n)_{n \in \mathbb{N}} = a_0, a_1, a_2, \ldots$ of non-negative
numbers. Typically, $a_k$ is the number of discrete structures of a certain type
and "size" $k$.

The generating function for $(a_n)_{n \in \mathbb{N}}$ is given as $F(x) = \sum_{n=0}^{\infty} a_n x^n$. Despite
the name, a generating function should not be thought of as a thing you plug
values into: It is not meaningful to compute something like $F(5)$. In fact, these
values are often not well-defined because the sum would be divergent. We care
about the thing as a whole: You can either think of generating functions as
functions that are well-defined within their radius of convergence (some area
close to zero) or you just think of the "$x$" in $F(x)$ as an abstract thing (not a
placeholder for a number) which makes $F(x)$ an element of the ring of formal
power series. In the following, we ignore all technicalities and boldly apply
analytic methods as though a generating function were just a simple well-behaved
function. And it just works. If you think this is all a bit arcane, try to look at
it this way:

At first, a generating function is just a silly way to write a sequence (instead
of $a_0 = 1, a_1 = 42, a_2 = 23, \ldots$ you would write $A(x) = 1 + 42x + 23x^2 + \ldots$).
Some operations on the sequence have a natural correspondence: For example,
shifting the sequence ($a_0 = 0, a_1 = 1, a_2 = 42, a_3 = 23, \ldots$) corresponds to
multiplying $A(x)$ by $x$. Some complicated operations on sequences suddenly
become simple and natural in the world of generating functions where analytic
tools are readily available.

When dealing with generating functions $F(x) = \sum_{n=0}^{\infty} a_n x^n$ and $G(x) =
\sum_{n=0}^{\infty} b_n x^n$, we freely use the following operations:

• Differentiate $F$ term-wise

$$F'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} = \sum_{n=0}^{\infty} (n+1) a_{n+1} x^n.$$

• Multiply $F(x)$ by a scalar $\lambda \in \mathbb{R}$ term-wise

$$\lambda F(x) = \sum_{n=0}^{\infty} \lambda a_n x^n.$$

• Add $F(x)$ and $G(x)$

$$F(x) + G(x) = \sum_{n=0}^{\infty} (a_n + b_n) x^n.$$

• Multiply $F(x)$ and $G(x)$

$$F(x) \cdot G(x) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) x^n.$$

We will later see that the coefficients of the product sometimes count
meaningful things if $a_n$ and $b_n$ did (see Examples 3.2 and 3.3).
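On a computer, a generating function can be represented by a list of its first few coefficients, and the operations above become short list manipulations. The following sketch (with made-up function names, working on truncated series only) illustrates differentiation and the product rule for coefficients.

```python
def multiply(a, b):
    """Coefficients of F(x) * G(x), truncated to the shorter input length."""
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]

def differentiate(a):
    """Coefficients of F'(x): the n-th entry is (n + 1) * a[n + 1]."""
    return [(n + 1) * a[n + 1] for n in range(len(a) - 1)]

ones = [1] * 8                       # 1 + x + x^2 + ... (first 8 terms)
squared = multiply(ones, ones)       # n-th coefficient counts k in {0, ..., n}
one_minus_x = [1, -1, 0, 0, 0, 0, 0, 0]

print(squared)
print(differentiate(ones))
print(multiply(one_minus_x, ones))
```

Truncation is harmless here because the $n$-th coefficient of a product only depends on coefficients with index at most $n$.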
Example 3.1 (Maclaurin series).

(i) Consider $(a_n)_{n \in \mathbb{N}}$ with $a_n = 1$ for all $n \in \mathbb{N}$. The corresponding generating
function $F(x) = 1 + x + x^2 + x^3 + \ldots$ is called the Maclaurin series.
We claim $F(x) = \frac{1}{1-x}$, which looks a lot like the identity for an infinite
geometric series. To verify this, just observe that the product of $F(x)$ and
$(1 - x)$ is one:

$$(1 - x) F(x) = (1 + x + x^2 + x^3 + \ldots) - (x + x^2 + x^3 + \ldots) = 1.$$

(ii) Differentiating $F(x) = \frac{1}{1-x}$ with respect to $x$ on both sides yields:

$$F'(x) = \sum_{n=0}^{\infty} (n+1) x^n = \frac{1}{(1-x)^2}.$$

So we found the generating function for $(b_n)_{n \in \mathbb{N}}$ with $b_n = n + 1$.

(iii) Substituting $x = -y$ in the Maclaurin series gives the generating function
of the alternating sequence (i.e. $a_n = (-1)^n$):

$$\frac{1}{1+y} = \sum_{n=0}^{\infty} (-1)^n y^n.$$

How do we count using Generating Functions? 


Example 3.2. Let $a_n^{(k)}$ be the number of arrangements of $n$ unlabeled balls in
$k$ labeled boxes with at least 1 ball per box. For each $k$, this gives a generating
function:

$$B^{(k)}(x) = \sum_{n=0}^{\infty} a_n^{(k)} x^n.$$

Considering the case with only one box, we have

$$a_n^{(1)} = \begin{cases} 0 & \text{for } n = 0 \\ 1 & \text{for } n \geq 1. \end{cases}$$

So $B^{(1)}(x)$ is just the Maclaurin series from 3.1(i) shifted by one position (i.e.
multiplied by $x$), meaning:

$$B^{(1)}(x) = 0 + x + x^2 + x^3 + \ldots = x \cdot \sum_{n=0}^{\infty} x^n = \frac{x}{1-x}.$$


The key observation we make now is that multiplying two generating functions
has a meaningful correspondence in our balls-in-boxes setting:
For two numbers of boxes $s$ and $t$, we have the identity:

$$a_n^{(s+t)} = \sum_{l=0}^{n} a_l^{(s)} a_{n-l}^{(t)}.$$

This is merely the observation that arrangements of $n$ balls in $s + t$ boxes are
given by arranging some $l$ balls in the first $s$ boxes and the remaining $n - l$ balls
in the remaining $t$ boxes. Looking at the right side of the equation, note that
these numbers are exactly the coefficients of the product $B^{(s)}(x) \cdot B^{(t)}(x)$ (check
this!). So we obtained:

$$B^{(s+t)}(x) = B^{(s)}(x) \cdot B^{(t)}(x), \qquad \text{and therefore} \qquad B^{(k)}(x) = \left(B^{(1)}(x)\right)^k = \left(\frac{x}{1-x}\right)^k.$$

However, we want to write $B^{(k)}(x)$ in the form $\sum_n a_n^{(k)} x^n$ to see the coefficients
$a_n^{(k)}$. To do this, note that differentiating the term $(1 - x)^{-1}$ exactly $k - 1$ times yields
$(k - 1)!\,(1 - x)^{-k}$. Using this we obtain:

$$\frac{x^k}{(1-x)^k} = \frac{x^k}{(k-1)!} \cdot \frac{d^{k-1}}{dx^{k-1}} \frac{1}{1-x} = \frac{x^k}{(k-1)!} \cdot \frac{d^{k-1}}{dx^{k-1}} \sum_{n=0}^{\infty} x^n$$

and therefore

$$\begin{aligned}
B^{(k)}(x) &= \frac{x^k}{(k-1)!} \sum_{n=k-1}^{\infty} n (n-1) \cdots (n-k+2)\, x^{n-k+1}
 = \sum_{n=k-1}^{\infty} \frac{n!}{(k-1)!\,(n-k+1)!}\, x^{n+1} \\
&= \sum_{n=k-1}^{\infty} \binom{n}{k-1} x^{n+1} = \sum_{n=0}^{\infty} \binom{n-1}{k-1} x^n,
\end{aligned}$$

where in the last step we use that $\binom{a}{b} = 0$ if $a < b$ and additionally make an index
shift of 1.

So we found a new proof that the number of arrangements of $n$ unlabeled
balls in $k$ labeled boxes with at least one ball per box is $a_n^{(k)} = \binom{n-1}{k-1}$.
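The identity $B^{(k)}(x) = (B^{(1)}(x))^k$ can be tested by multiplying truncated coefficient lists. This sketch assumes the list representation from before (index $n$ holds the coefficient of $x^n$); the names `multiply` and `boxes_series` are hypothetical.

```python
from math import comb

def multiply(a, b):
    """Coefficients of the product of two series, truncated to the shorter length."""
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]

def boxes_series(k, size):
    """First `size` coefficients of (x / (1 - x))^k, i.e. of B^(k)(x)."""
    one_box = [0] + [1] * (size - 1)     # B^(1)(x) = x + x^2 + x^3 + ...
    result = [1] + [0] * (size - 1)      # the constant series 1
    for _ in range(k):
        result = multiply(result, one_box)
    return result

print(boxes_series(3, 10))
```

For $k = 3$ the coefficients from $x^3$ onward are $1, 3, 6, 10, \ldots$, which are the values $\binom{n-1}{2}$.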


Example 3.3. We want to count integer solutions for $a + b + c = n$ with a
non-negative even integer $a$, a non-negative integer $b$ and $c \in \{0, 1, 2\}$. Equivalently,
we can think of this as counting arrangements of $n$ unlabeled balls in boxes
labeled $a$, $b$ and $c$ where box $a$ should receive an even number of balls and $c$ at
most 2 balls.

We first consider the simpler situations where only one of the
variables/boxes exists:

• Solutions to $a = n$ with even $a$. Clearly, there is a unique solution for
even $n$ and no solution for odd $n$. The corresponding generating function
is:

$$A(x) = \sum_{n=0}^{\infty} x^{2n} = \frac{1}{1-x^2}.$$

• Solutions to $b = n$ where $b$ is a non-negative integer. Clearly, there is exactly one
solution for each $n$. The corresponding generating function is the Maclaurin
series:

$$B(x) = \sum_{n=0}^{\infty} x^n = \frac{1}{1-x}.$$

• Solutions to $c = n$ where $c \in \{0, 1, 2\}$. Clearly, there is exactly one solution
if $n \in \{0, 1, 2\}$ and no solution otherwise. The corresponding generating
function is:

$$C(x) = 1 + x + x^2.$$


With the same argument as in the previous example, multiplying the generating
functions yields the generating function of the sequence we are interested in, so
we calculate:

$$F(x) = A(x) \cdot B(x) \cdot C(x) = \frac{1 + x + x^2}{(1 - x^2)(1 - x)} = \frac{1 + x + x^2}{(1 + x)(1 - x)^2}.$$

We would like to write this as a linear combination of generating functions we
understand well, so we search for real numbers $R, S, T$ with:

$$\frac{1 + x + x^2}{(1 + x)(1 - x)^2} = \frac{R}{1 + x} + \frac{S}{1 - x} + \frac{T}{(1 - x)^2}.$$

Multiplying by the denominators yields:

$$1 + x + x^2 = R(1 - x)^2 + S(1 - x^2) + T(1 + x).$$

We compare the coefficients in front of $1$, $x$ and $x^2$ and get the equations $1 =
R + S + T$, $1 = -2R + T$, $1 = R - S$. This system has the unique solution
$R = \frac{1}{4}$, $S = -\frac{3}{4}$, $T = \frac{3}{2}$.
Using this and the identities we obtained in Example 3.1 yields:

$$F(x) = \frac{1}{4} \sum_{n=0}^{\infty} (-1)^n x^n - \frac{3}{4} \sum_{n=0}^{\infty} x^n + \frac{3}{2} \sum_{n=0}^{\infty} (n+1) x^n.$$

So the coefficient of $x^n$, and therefore the number of solutions to $a + b + c = n$
we want to count, is:

$$\frac{(-1)^n}{4} - \frac{3}{4} + \frac{3(n+1)}{2}.$$
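A brute-force count is an easy way to double-check the partial fraction computation. The sketch below enumerates the admissible pairs $(a, c)$ (the value of $b$ is then forced) and compares the count with the closed form, using exact fractions to avoid rounding. The function names are illustrative.

```python
from fractions import Fraction

def count_solutions(n):
    """Brute force: a even, b >= 0, c in {0, 1, 2}, a + b + c = n."""
    return sum(
        1
        for a in range(0, n + 1, 2)
        for c in range(3)
        if n - a - c >= 0          # b = n - a - c must be non-negative
    )

def closed_form(n):
    """(-1)^n / 4 - 3/4 + 3(n + 1)/2."""
    return Fraction((-1) ** n, 4) - Fraction(3, 4) + Fraction(3 * (n + 1), 2)

print([count_solutions(n) for n in range(8)])
print([int(closed_form(n)) for n in range(8)])
```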

Example 3.5. We want to count the number $a_n$ of well-formed parenthesis
expressions with $n$ pairs of parentheses. For example (()())() is a well-formed
expression with 4 pairs of parentheses but )())(( is not. Formally, a permutation
of the multiset $\{n \cdot \text{"("},\ n \cdot \text{")"}\}$ is well-formed if reading it from left to right and
counting "+1" for every "(" and "-1" for every ")" will never yield a negative
number at any time (no prefix contains more closing than opening parentheses).

Every well-formed expression with $n \geq 1$ pairs of parentheses starts with "("
and there is a unique matching ")" such that the sequence in between and the
sequence after are well-formed (possibly empty) expressions. For example, (()())()
splits into "(", the well-formed expression ()() in between, the matching ")",
and the well-formed expression () after it.

In other words, a well-formed expression with $n$ pairs of parentheses is obtained
by putting a well-formed expression with $k$ pairs in between "(" and ")" and
then appending a well-formed expression with $n - k - 1$ pairs of parentheses. This
gives the equation:

$$a_n = \sum_{k=0}^{n-1} a_k a_{n-k-1}.$$
So if $F(x)$ is the generating function belonging to $a_n$, then we know:

$$\begin{aligned}
F(x) = \sum_{n=0}^{\infty} a_n x^n &= 1 + \sum_{n=1}^{\infty} \left( \sum_{k=0}^{n-1} a_k a_{n-k-1} \right) x^n = 1 + \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k a_{n-k} \right) x^{n+1} \\
&= 1 + x \cdot \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k a_{n-k} \right) x^n = 1 + x \cdot F(x)^2.
\end{aligned}$$

And with analytic methods we can find a solution to this as:

$$F(x) = \frac{1 - \sqrt{1 - 4x}}{2x}.$$

So far it is unclear what this means for the coefficients of $F(x)$ (and therefore
for the number of well-formed expressions). It seems we need additional tools
to understand the power series of $\sqrt{1 - 4x}$.


3.1 Newton’s Binomial Theorem 


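Recall the following special case of the Multinomial Theorem (Theorem 1.6).

$$(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k = \sum_{k=0}^{\infty} \binom{n}{k} x^k \qquad \forall n \in \mathbb{N},$$

where, as usual, $\binom{n}{k} = 0$ for $k > n$. This shows that $(1 + x)^n$ is the generating
function for the sequence $(a_k)_{k \in \mathbb{N}}$ with $a_k = \binom{n}{k}$.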

We extend this result from natural numbers n € N to any real number n € R. 
To this end we first extend the definition of binomial coefficients. 


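Definition 3.7 (Binomial Coefficients for Real Numbers). Recall that for
integers $n, k \in \mathbb{N}$ we have:

$$\binom{n}{k} = \frac{\#\{k\text{-permutations of } [n]\}}{k!} = \frac{n (n-1) \cdots (n-k+1)}{k!} = \frac{n!}{k!\,(n-k)!}.$$

It does not make sense to talk about permutations of sets of size $n \in \mathbb{R}$ and it is
unclear what $n!$ should be, but the formula in between is well-defined for general
$n \in \mathbb{R}$. With this in mind, we define $p(n, k) = n \cdot (n-1) \cdot \ldots \cdot (n-k+1)$ and
$\binom{n}{k} = \frac{p(n, k)}{k!}$. Note that the new definition matches the old one if $n$ is an integer.

We can now talk about numbers such as "$-7/2$ choose 5", by which we mean:

$$\binom{-7/2}{5} = \frac{(-\frac{7}{2})(-\frac{9}{2})(-\frac{11}{2})(-\frac{13}{2})(-\frac{15}{2})}{5!} = -\frac{9009}{256}.$$

Note that for $n \in \mathbb{R}$ we have $p(n, 0) = 1$ and for $k \geq 1$ the recursions:

$$\begin{aligned}
p(n, k) &= (n - k + 1) \cdot p(n, k-1) \qquad (\star) \\
&= n \cdot p(n-1, k-1).
\end{aligned}$$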
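The extended definition is straightforward to evaluate with exact rational arithmetic. In the sketch below, `falling` plays the role of $p(n, k)$ and `binom` the role of the generalized binomial coefficient; both names are chosen for this illustration.

```python
from fractions import Fraction
from math import factorial

def falling(n, k):
    """p(n, k) = n * (n - 1) * ... * (n - k + 1), with p(n, 0) = 1."""
    result = Fraction(1)
    for i in range(k):
        result *= n - i
    return result

def binom(n, k):
    """Generalized binomial coefficient p(n, k) / k! for rational n."""
    return falling(n, k) / factorial(k)

n = Fraction(-7, 2)
print(binom(n, 5))                                     # the example from the text
print(falling(n, 5) == (n - 5 + 1) * falling(n, 4))    # first recursion
print(falling(n, 5) == n * falling(n - 1, 4))          # second recursion
```

For integer arguments the function agrees with the usual binomial coefficient, for example `binom(5, 2)` evaluates to 10.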


Given our extended definition of binomial coefficients, we can state the following 
Theorem (but will omit the proof). 
Theorem 3.8 (Newton's Binomial Theorem). For all non-zero $n \in \mathbb{R}$ we have:

$$(1 + x)^n = \sum_{k=0}^{\infty} \binom{n}{k} x^k.$$

Setting $n = 1/2$ yields an identity for $\sqrt{1 + x}$ that we require to proceed
in Example 3.5. But before we can use it, we need to better understand the
coefficients of the form $\binom{1/2}{n}$.

Lemma 3.9. For any integer $n \geq 1$ we have

$$\binom{1/2}{n} = (-1)^{n+1} \binom{2n-2}{n-1} \frac{1}{2^{2n-1}} \cdot \frac{1}{n}.$$


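Proof. We do induction on $n$. For $n = 1$:

$$\binom{1/2}{1} = \frac{1}{2} = (-1)^{2} \binom{0}{0} \frac{1}{2^{1}} \cdot \frac{1}{1}.$$

For the induction step ($n \to n + 1$) we use recursion $(\star)$ from above:

$$\begin{aligned}
\binom{1/2}{n+1} &= \frac{p(1/2, n+1)}{(n+1)!} = \frac{(1/2 - (n+1) + 1)\, p(1/2, n)}{(n+1) \cdot n!} = \frac{1/2 - n}{n+1} \binom{1/2}{n} \\
&\overset{\text{I.H.}}{=} \frac{1/2 - n}{n+1}\, (-1)^{n+1} \binom{2n-2}{n-1} \frac{1}{2^{2n-1}} \cdot \frac{1}{n} \\
&= \frac{2n}{2n} \cdot \frac{(-1)(2n-1)}{2n}\, (-1)^{n+1} \binom{2n-2}{n-1} \frac{1}{2^{2n-1}} \cdot \frac{1}{n+1} \\
&= (-1)^{n+2}\, \underbrace{\frac{(2n-2)!\,(2n-1)(2n)}{(n-1)!\,(n-1)! \cdot n \cdot n}}_{\binom{2n}{n}} \cdot \frac{1}{2^{2n+1}} \cdot \frac{1}{n+1},
\end{aligned}$$

which is the claimed identity for $n + 1$.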
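Proposition 3.10. Using the last Theorem and Lemma we obtain:

$$\sqrt{1 + x} \overset{3.8}{=} \sum_{n=0}^{\infty} \binom{1/2}{n} x^n \overset{3.9}{=} 1 + 2 \sum_{n=1}^{\infty} \binom{2n-2}{n-1} \frac{(-1)^{n+1}}{n \cdot 2^{2n}}\, x^n.$$

Example 3.5 (Continued). Using the proposition we are now able to find the
coefficients of the generating function from Example 3.5 above, i.e. the number
of well-formed parenthesis expressions.

$$\begin{aligned}
F(x) = \frac{1 - \sqrt{1 - 4x}}{2x} &\overset{3.10}{=} \frac{1}{2x} \cdot \left( -2 \sum_{n=1}^{\infty} \binom{2n-2}{n-1} \frac{(-1)^{n+1}}{n \cdot 2^{2n}}\, (-4x)^n \right) \\
&= \frac{1}{x} \sum_{n=1}^{\infty} \binom{2n-2}{n-1} \frac{1}{n}\, x^n = \sum_{n=0}^{\infty} \binom{2n}{n} \frac{1}{n+1}\, x^n.
\end{aligned}$$

The numbers $C_n := \frac{1}{n+1} \binom{2n}{n}$ are called Catalan Numbers. They do not only
count well-formed parenthesis expressions but occur in other situations as well.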
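The closed form can be compared with the recurrence $a_n = \sum_{k=0}^{n-1} a_k a_{n-k-1}$ and with a direct count of well-formed expressions. The following sketch uses illustrative function names; the brute-force counter extends a prefix by "(" or ")" as long as the prefix condition is not violated.

```python
from math import comb

def catalan_by_recurrence(count):
    """First `count` values of a_n with a_0 = 1, a_n = sum_{k<n} a_k * a_{n-k-1}."""
    a = [1]
    for n in range(1, count):
        a.append(sum(a[k] * a[n - k - 1] for k in range(n)))
    return a

def catalan_closed(n):
    """C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def count_well_formed(n):
    """Count well-formed expressions with n pairs by extending valid prefixes."""
    def extend(opened, closed):
        if opened == n and closed == n:
            return 1
        total = 0
        if opened < n:                 # may still open a parenthesis
            total += extend(opened + 1, closed)
        if closed < opened:            # closing keeps the prefix valid
            total += extend(opened, closed + 1)
        return total
    return extend(0, 0)

print(catalan_by_recurrence(10))
```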
3.2. Exponential Generating Functions


Until now we mapped sequences to functions like this:

$$(a_n)_{n \in \mathbb{N}} \mapsto \sum_{n=0}^{\infty} a_n x^n \in \mathbb{R}[[x]]$$

In other words, we used the coefficients $a_n$ to obtain a linear combination of the
basis $\{x^n\}_{n \in \mathbb{N}}$ of $\mathbb{R}[[x]]$.

Our choice of a basis was useful in the cases we considered, but we can
consider other bases as well that will be useful in other situations. The following
sets are all bases of $\mathbb{R}[[x]]$:

$$\left\{ \frac{x^n}{n!} \right\}_{n \in \mathbb{N}}, \qquad \left\{ \frac{1}{n^x} \right\}_{n \in \mathbb{N}}.$$

In the last case we get from any sequence $(a_n)_{n \in \mathbb{N}}$ a corresponding Dirichlet
series $\sum_{n=1}^{\infty} \frac{a_n}{n^x}$. Such series are important in algebraic number theory (in that
setting the variable is typically called $s$ instead of $x$). As an example, we state
the following result (without proof).

Theorem 3.11 (Euler Product). The Dirichlet series for $(\mu(n))_{n \in \mathbb{N}}$ satisfies

$$\sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)} = \prod_{p \text{ prime}} \left(1 - p^{-s}\right),$$

where $\zeta(s)$ is the Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$.


We will not examine Dirichlet series further, dealing with exponential
generating functions instead. In contrast to ordinary generating functions (the kind
we considered before) the basis is not $\{x^n\}_{n \in \mathbb{N}}$ but $\{\frac{x^n}{n!}\}_{n \in \mathbb{N}}$ and instead of
counting arrangements of unlabeled objects (unlabeled balls in labeled boxes,
solutions to $a_1 + a_2 + \ldots + a_k = n$ with constraints, indistinguishable
parentheses in an ordered string), exponential generating functions are useful to count
arrangements of labeled objects (permutations, derangements, partitions, ...)
as we will see shortly.

Before we get started, note the exponential generating functions $A(x)$ and
$B(x)$ of $(a_n)_{n \in \mathbb{N}}$ and $(b_n)_{n \in \mathbb{N}}$ with $a_n = 1$ and $b_n = n!$ ($n \in \mathbb{N}$) are

$$A(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!} = e^x,$$

$$B(x) = \sum_{n=0}^{\infty} n! \frac{x^n}{n!} = \sum_{n=0}^{\infty} x^n = \frac{1}{1-x}.$$


The following two observations show the connection of exponential generating 
functions to arrangement of labeled objects. 


Observation 3.12. Say the numbers $(a_n)_{n \in \mathbb{N}}$ and $(b_n)_{n \in \mathbb{N}}$ count arrangements
of type A and B, respectively, using $n$ labeled objects.
Assume arrangements of type C with $n$ objects are obtained by a unique
split of the $n$ objects into two sets and then forming an arrangement of type A
with the first set and an arrangement of type B with the second.

Then $c_n$, the number of arrangements of type C and size $n$, is given as

$$c_n = \sum_{k=0}^{n} \binom{n}{k} a_k \cdot b_{n-k}.$$

Crucially, the exponential generating functions $A(x), B(x), C(x)$ of the three
sequences reflect this relationship as $C(x) = A(x) \cdot B(x)$, which is easy to verify:

$$\begin{aligned}
A(x) \cdot B(x) &= \left( \sum_{n=0}^{\infty} a_n \frac{x^n}{n!} \right) \left( \sum_{n=0}^{\infty} b_n \frac{x^n}{n!} \right) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} \frac{a_k\, b_{n-k}}{k!\,(n-k)!} \right) x^n \\
&= \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} \binom{n}{k} a_k b_{n-k} \right) \frac{x^n}{n!} = C(x).
\end{aligned}$$


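Observation 3.13. An index shift by one in the sequence corresponds to
differentiation, meaning

$$F(x) = \sum_{n=0}^{\infty} a_n \frac{x^n}{n!} \quad \overset{\text{differentiate}}{\Longrightarrow} \quad F'(x) = \sum_{n=1}^{\infty} a_n \frac{x^{n-1}}{(n-1)!} = \sum_{n=0}^{\infty} a_{n+1} \frac{x^n}{n!}.$$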


We already encountered Stirling numbers before. We distinguish three kinds:

• The unsigned Stirling numbers of the first kind $s^{I}_k(n)$ fulfill the recursion

$$s^{I}_k(n) = (n - 1)\, s^{I}_k(n - 1) + s^{I}_{k-1}(n - 1)$$

and count the number of permutations of $[n]$ with $k$ cycles.

• The signed Stirling numbers of the first kind are simply

$$\tilde{s}^{I}_k(n) = (-1)^{n-k} s^{I}_k(n).$$

• The Stirling numbers of the second kind $s^{II}_k(n)$ fulfill

$$s^{II}_k(n) = k\, s^{II}_k(n - 1) + s^{II}_{k-1}(n - 1)$$

and count the number of partitions of $[n]$ into $k$ non-empty parts.


Theorem 3.14. For $k \in \mathbb{N}$, we have the following identity for the exponential
generating function $F_k(x)$ of the Stirling numbers of the second kind $(s^{II}_k(n))_{n \in \mathbb{N}}$:

$$F_k(x) = \sum_{n=0}^{\infty} s^{II}_k(n) \frac{x^n}{n!} = \frac{1}{k!} (e^x - 1)^k.$$

Proof. We proceed by induction on $k$. For $k = 1$ we have:

$$s^{II}_1(n) = \begin{cases} 1 & \text{for } n \geq 1 \\ 0 & \text{for } n = 0 \end{cases}$$

and the claim follows from the identity $\sum_{n=1}^{\infty} \frac{x^n}{n!} = e^x - 1$.

If $k \geq 2$, we use the recursion of Stirling numbers from above:

$$s^{II}_k(n + 1) = k\, s^{II}_k(n) + s^{II}_{k-1}(n).$$

Taking the generating functions for each term (remember that derivatives
correspond to index shifts!) and then applying the induction hypothesis yields:

$$F_k'(x) = k \cdot F_k(x) + F_{k-1}(x) \overset{\text{I.H.}}{=} k\, F_k(x) + \frac{1}{(k-1)!} (e^x - 1)^{k-1}.$$

This differential equation has the solution

$$F_k(x) = \frac{1}{k!} (e^x - 1)^k$$

as claimed, where we used $F_k(0) = 0$.


Note that solutions to differential equations may be hard to find but are easy 
to verify. Finding the solutions is not the topic of this lecture. It is perfectly 
alright if you use computers (or techniques from other lectures). 
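One way to test Theorem 3.14 without any calculus is to expand $(e^x - 1)^k$ with the binomial theorem: since $e^{jx} = \sum_n j^n \frac{x^n}{n!}$, the coefficient of $\frac{x^n}{n!}$ in $\frac{1}{k!}(e^x - 1)^k$ is $\frac{1}{k!} \sum_{j=0}^{k} (-1)^{k-j} \binom{k}{j} j^n$. The sketch below compares this with the recursion; the function names are hypothetical, and the base cases $s^{II}_0(0) = 1$, $s^{II}_k(0) = 0$ for $k \geq 1$ are assumed.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Partitions of [n] into k non-empty parts, via the recursion."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def stirling2_from_egf(n, k):
    """n! times the coefficient of x^n in (e^x - 1)^k / k!."""
    total = sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1))
    return total // factorial(k)

print([stirling2(5, k) for k in range(6)])
print([stirling2_from_egf(5, k) for k in range(6)])
```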


To later obtain the exponential generating function for Stirling numbers of 
the first kind, we first prove two identities: 


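Lemma 3.15.

(i) $\displaystyle \sum_{k=0}^{n} s^{I}_k(n)\, x^k = p(x + n - 1, n)$.

(ii) $\displaystyle \sum_{k=0}^{n} \tilde{s}^{I}_k(n)\, x^k = p(x, n)$, where $\tilde{s}^{I}_k(n) = (-1)^{n-k} s^{I}_k(n)$ are the signed Stirling numbers of the first kind.

Proof. Note that $p(x + n - 1, n)$ is a polynomial of degree $n$ with variable $x$ so
there are constants $b_{n,k}$ such that $p(x + n - 1, n) = \sum_{k=0}^{n} b_{n,k} x^k$. They fulfill
the following equation:

$$\begin{aligned}
\sum_{k=0}^{n} b_{n,k} x^k &= p(x + n - 1, n) = (x + n - 1) \cdot p(x + n - 2, n - 1) \\
&= (x + n - 1) \sum_{k=0}^{n-1} b_{n-1,k} x^k = \sum_{k=1}^{n} b_{n-1,k-1} x^k + (n - 1) \sum_{k=0}^{n-1} b_{n-1,k} x^k.
\end{aligned}$$

Comparing the coefficients of $x^k$, we get:

$$b_{n,k} = b_{n-1,k-1} + (n - 1)\, b_{n-1,k}$$

where we define $b_{n,k} = 0$ for $k > n$ or $k < 0$.

So the numbers $b_{n,k}$ fulfill the same recursion as $s^{I}_k(n)$ and since the starting
values $s^{I}_0(0) = b_{0,0} = 1$ match as well, this proves $b_{n,k} = s^{I}_k(n)$ and thus (i).

For (ii), plug $-x$ into the equation above:

$$\sum_{k=0}^{n} (-1)^k s^{I}_k(n)\, x^k = p(-x + n - 1, n) = \prod_{k=1}^{n} (-x + n - k) = (-1)^n \prod_{k=1}^{n} (x - n + k) = (-1)^n p(x, n).$$

Multiplying by $(-1)^n$ gives the desired result:

$$\sum_{k=0}^{n} \underbrace{(-1)^{n-k} s^{I}_k(n)}_{\tilde{s}^{I}_k(n)}\, x^k = p(x, n).$$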
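Part (i) of the lemma can be tested for small $n$ by expanding $p(x + n - 1, n) = x (x+1) \cdots (x+n-1)$ and counting permutations by their number of cycles. This is a brute-force sketch with made-up function names; permutations are represented as tuples on $\{0, \ldots, n-1\}$ instead of $[n]$.

```python
from itertools import permutations

def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple (i maps to perm[i])."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles

def stirling1_brute(n):
    """counts[k] = number of permutations of n elements with exactly k cycles."""
    counts = [0] * (n + 1)
    for perm in permutations(range(n)):
        counts[cycle_count(perm)] += 1
    return counts

def rising_factorial_coeffs(n):
    """Coefficients (lowest degree first) of x (x + 1) ... (x + n - 1)."""
    coeffs = [1]
    for m in range(n):
        shifted = [0] + coeffs                   # x * current polynomial
        scaled = [m * c for c in coeffs] + [0]   # m * current polynomial
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

print(stirling1_brute(4))
print(rising_factorial_coeffs(4))
```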


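Theorem 3.16. The exponential generating function of the signed Stirling numbers
of the first kind $(\tilde{s}^{I}_k(n))_{n \in \mathbb{N}}$ fulfills:

$$\sum_{n=0}^{\infty} \tilde{s}^{I}_k(n) \frac{x^n}{n!} = \frac{1}{k!} \left( \log(1 + x) \right)^k.$$

Proof. One way to expand $(1 + x)^z$ is:

$$(1 + x)^z = e^{z \log(1 + x)} = \sum_{k=0}^{\infty} \frac{1}{k!} \left( \log(1 + x) \right)^k z^k.$$

On the other hand, we can also expand $(1 + x)^z$ using Newton's Binomial
Theorem and the last Lemma:

$$\begin{aligned}
(1 + x)^z &\overset{3.8}{=} \sum_{n=0}^{\infty} \binom{z}{n} x^n = \sum_{n=0}^{\infty} \frac{p(z, n)}{n!} x^n \overset{3.15}{=} \sum_{n=0}^{\infty} \frac{x^n}{n!} \sum_{k=0}^{n} \tilde{s}^{I}_k(n)\, z^k \\
&= \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} \tilde{s}^{I}_k(n) \frac{x^n}{n!} z^k = \sum_{k=0}^{\infty} \left( \sum_{n=0}^{\infty} \tilde{s}^{I}_k(n) \frac{x^n}{n!} \right) z^k,
\end{aligned}$$

where we used that $\tilde{s}^{I}_k(n) = 0$ for $k > n$.

We have written $(1 + x)^z$ in two ways and the coefficients of $z^k$ in both
representations must match. This proves the claim.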


3.3. Recurrence Relations 


We have already seen (in the case of Stirling numbers) that relations between
numbers in a sequence may fully characterize the sequence (if some starting
value is given). A sequence $(a_n)_{n \in \mathbb{N}}$ is defined recursively if

$$a_n = f(a_{n-1}, a_{n-2}, \ldots, a_{n-k})$$

for some function $f$, a positive integer $k$, and all $n \geq k$. This identity $a_n =
f(a_{n-1}, a_{n-2}, \ldots, a_{n-k})$ is called a recurrence or a recurrence relation or a
recursion of order $k$. If $a_0, a_1, \ldots, a_{k-1}$ are known, they are called initial values
or initial conditions. The sequence $(a_n)_{n \in \mathbb{N}}$ is then called a solution of the
recursion $f$ with initial values $a_0, \ldots, a_{k-1}$.

In the following we examine special cases of such recurrence relations and 
show how the sequence can be derived from them. 
Example 3.17 (Fibonacci Numbers). Rabbits were first brought to Australia
by settlers in 1788. Assume for simplicity that the first pair of young rabbits (male
and female) arrives in January (at time $n = 1$). A month later ($n = 2$) this
pair of rabbits reaches adulthood. Another month later ($n = 3$) they produce
a new young pair of rabbits as offspring. In general, assume that in the course
of a month every pair of young rabbits grows into adulthood and every pair of
adult rabbits produces one pair of young rabbits as offspring. If $F_n$ denotes the
number of pairs of rabbits at time $n$, then we have $F_n = F_{n-1} + F_{n-2}$, since exactly
the pairs that already existed at time $n - 2$ will be adults at time $n - 1$ and
produce offspring between time $n - 1$ and $n$. This gives the sequence:

$$F_0 = 0,\ F_1 = 1,\ F_2 = 1,\ F_3 = 2,\ F_4 = 3,\ F_5 = 5,\ F_6 = 8,\ F_7 = 13,\ F_8 = 21,\ \ldots$$

The sequence $(F_n)_{n \in \mathbb{N}}$ is called the Fibonacci sequence and its elements are
the famous Fibonacci numbers. Rabbits in Australia are now a serious problem:
they cause substantial damage to crops and have been combated with ferrets,
fences, poison, firearms and the myxoma virus.


Photos taken by Daniel Oines and Esdras Calderan 


Figure 19: Spiraling patterns can be seen in many plants. The pine cone on the 
left has 8 spiral arms spiraling out counterclockwise and 13 spiral arms spiraling 
out clockwise. In the sunflower, different spiral patterns stand out depending 
on where you look. Close to the center, a counterclockwise spiral with 21 arms 
and a clockwise spiral with 34 arms can be seen. Closer to the border another 
counterclockwise spiral with 55 arms emerges. Fibonacci numbers everywhere! 
We’re not making this up, zoom in and count for yourselves (or better yet, go 
outside and look at nature directly), the patterns are really there! 


Fibonacci numbers also frequently occur in plants (see Figure 19). In case 
you do not know Vi Hart (seriously?!) you should definitely check out her 
Youtube Channel and watch her videos on this topic. 


Another example where Fibonacci numbers occur is tilings of checkerboards
of size $n \times 2$ with dominoes as shown in Figure 20. Let $c_n$ be the number of
such tilings. Note that there are two possibilities to cover the rightmost column
(see Figure 21): Either with a vertical domino, then a board of size $(n - 1) \times 2$
remains to be tiled, or with horizontal dominoes, then a board of size $(n - 2) \times 2$
remains to be tiled. This means $c_n = c_{n-1} + c_{n-2}$ and together with $c_1 = 1$ and
$c_2 = 2$ (and setting $c_0 = 1$) this implies $c_n = F_{n+1}$.


Figure 20: A tiling of the 9 x 2 board with dominoes.


Figure 21: The last column can either be covered by a vertical domino or by
horizontal dominoes.
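The claim $c_n = F_{n+1}$ can be checked by actually enumerating tilings instead of using the recurrence. The backtracking sketch below always covers the first empty cell with either a vertical or a horizontal domino; the board layout (2 rows, $n$ columns) and the function names are choices made for this illustration.

```python
def fibonacci(count):
    """The first `count` Fibonacci numbers F_0, F_1, ..."""
    f = [0, 1]
    while len(f) < count:
        f.append(f[-1] + f[-2])
    return f[:count]

def count_tilings(n):
    """Brute-force count of domino tilings of a board with n columns and 2 rows."""
    filled = [[False] * n for _ in range(2)]

    def place():
        # Find the first empty cell, scanning column by column.
        for col in range(n):
            for row in range(2):
                if not filled[row][col]:
                    total = 0
                    # Vertical domino covering both rows of this column.
                    if row == 0 and not filled[1][col]:
                        filled[0][col] = filled[1][col] = True
                        total += place()
                        filled[0][col] = filled[1][col] = False
                    # Horizontal domino covering this cell and its right neighbor.
                    if col + 1 < n and not filled[row][col + 1]:
                        filled[row][col] = filled[row][col + 1] = True
                        total += place()
                        filled[row][col] = filled[row][col + 1] = False
                    return total
        return 1  # no empty cell left: one complete tiling

    return place()

print([count_tilings(n) for n in range(8)])
```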


Example 3.18 (Ternary n-strings without substring 20). Consider strings
over the alphabet $\{0, 1, 2\}$ that do not contain the substring 20, for instance
0010221021 or 1111011102. Let $t_n$ be the number of such strings of length $n$.
We have $t_0 = 1$ (the empty string), $t_1 = 3$ (each string of length 1), and $t_2 = 8$
(all strings of length 2 except for 20).

In general, an $n$-string without 20 consists of an $(n - 1)$-string without 20
and an additional digit, which gives $3 \cdot t_{n-1}$ possibilities. However, this may
produce strings that end in 20 (but have no 20 in other places). Such a string
is an $(n - 2)$-string without 20 followed by 20, so there are $t_{n-2}$
of them.

This gives $t_n = 3 \cdot t_{n-1} - t_{n-2}$.
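For small $n$ the recurrence can be compared with a direct enumeration of all $3^n$ ternary strings. This is only a sketch with illustrative function names.

```python
from itertools import product

def count_brute(n):
    """Number of strings over {0, 1, 2} of length n without the substring 20."""
    return sum(
        1 for digits in product("012", repeat=n) if "20" not in "".join(digits)
    )

def count_recurrence(count):
    """First `count` values of t_n with t_0 = 1, t_1 = 3, t_n = 3 t_{n-1} - t_{n-2}."""
    t = [1, 3]
    while len(t) < count:
        t.append(3 * t[-1] - t[-2])
    return t[:count]

print(count_recurrence(8))
print([count_brute(n) for n in range(8)])
```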


The Fibonacci sequence and the sequence $(t_n)_{n \in \mathbb{N}}$ from the last example have
in common that they are solutions of some recurrence relation of order 2, i.e.,
their elements can be described in terms of the previous two elements. Moreover,
each element can be expressed as a sum of multiples of previous elements.
We now develop a general theory dealing with such recurrence relations.

Definition 3.19. A recurrence relation of the form

$$a_n = c_1(n)\, a_{n-1} + c_2(n)\, a_{n-2} + \cdots + c_k(n)\, a_{n-k} + g(n)$$

for functions $g \colon \mathbb{N} \to \mathbb{R}$ and $c_i \colon \mathbb{N} \to \mathbb{R}$, $i = 1, \ldots, k$, is called a linear recurrence.
In case $g = 0$ (that is, $g(n) = 0$ for all $n \in \mathbb{N}$), the recurrence is called linear
homogeneous, otherwise linear non-homogeneous.

In this chapter we shall treat the case when the coefficients $c_i$ are constants,
so

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}.$$

We assume that $c_k \neq 0$, otherwise we treat the recursion as one of a smaller
order.

The Fibonacci sequence fulfills a homogeneous linear recurrence relation with
constant coefficients

$$F_n = F_{n-1} + F_{n-2}$$

where $k = 2$, $c_1 = 1$, $c_2 = 1$, $g = 0$.
The same is true for $t_n$ from Example 3.18

$$t_n = 3 t_{n-1} - t_{n-2}$$

where this time $k = 2$, $c_1 = 3$, $c_2 = -1$, $g = 0$.


3.3.1 Special Solution $a_n = x^n$ and the Characteristic Polynomial


Throughout this section assume we are given a linear homogeneous recurrence
relation $f$ of the form

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \ldots + c_k a_{n-k}$$

of order $k$. If one replaces $a_i$ with $x^i$ and plugs it into the recursion, the following
is obtained:

$$x^n = c_1 x^{n-1} + c_2 x^{n-2} + \ldots + c_k x^{n-k}.$$

Dividing by $x^{n-k}$ and bringing terms to one side of the equation gives:

$$x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_k = 0.$$

Solving this equation allows us to find some solutions. The characteristic
polynomial of $f$ is given by

$$p(x) = x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_{k-1} x - c_k.$$

We see that a homogeneous linear recursion can be reconstructed from its
characteristic polynomial by replacing $x^i$ with $a_{n-k+i}$:

$$a_n - c_1 a_{n-1} - c_2 a_{n-2} - \cdots - c_k a_{n-k} = 0,$$

$$x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_{k-1} x - c_k = 0.$$

Note further that we can assume that all the roots of the characteristic
polynomial are non-zero because $c_k \neq 0$.
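For order 2 the characteristic polynomial is $x^2 - c_1 x - c_2$, and its roots can be computed with the quadratic formula. The sketch below does this for the Fibonacci recurrence ($c_1 = c_2 = 1$) and for $t_n = 3 t_{n-1} - t_{n-2}$ ($c_1 = 3$, $c_2 = -1$) and checks numerically that $a_n = q^n$ satisfies the recurrence for each root $q$. It assumes a non-negative discriminant and uses floating point with a tolerance; the function names are hypothetical.

```python
import math

def characteristic_roots(c1, c2):
    """Real roots of x^2 - c1*x - c2 (assumes a non-negative discriminant)."""
    disc = c1 * c1 + 4 * c2
    root = math.sqrt(disc)
    return (c1 + root) / 2, (c1 - root) / 2

def satisfies_recurrence(q, c1, c2, upto=20):
    """Check q^n = c1 * q^(n-1) + c2 * q^(n-2) for 2 <= n < upto."""
    return all(
        math.isclose(q ** n, c1 * q ** (n - 1) + c2 * q ** (n - 2), rel_tol=1e-9)
        for n in range(2, upto)
    )

for c1, c2 in [(1, 1), (3, -1)]:
    for q in characteristic_roots(c1, c2):
        print(c1, c2, q, satisfies_recurrence(q, c1, c2))
```

Both roots are non-zero in these two examples, in line with the remark that $c_k \neq 0$ rules out the root 0.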

In this section we shall present some methods to solve linear homogeneous 
recurrences and some specific non-homogeneous recurrences. 


Lemma 3.20 (Linearity Property). Let $f$ be a linear homogeneous recurrence
with constant coefficients. If $(b_n)_{n \in \mathbb{N}}$ is a solution of $f$ with some initial values
and $(d_n)_{n \in \mathbb{N}}$ is a solution of $f$ with some initial values, then for any constants
$\beta, \delta \in \mathbb{R}$, $(\beta b_n + \delta d_n)_{n \in \mathbb{N}}$ is a solution of $f$ with some initial values.

Proof. Let $a_n = f(a_{n-1}, \ldots, a_{n-k}) = c_1 a_{n-1} + \cdots + c_k a_{n-k}$. Thus $b_n = c_1 b_{n-1} +
\cdots + c_k b_{n-k}$ and $d_n = c_1 d_{n-1} + \cdots + c_k d_{n-k}$ for all $n \geq k$. Multiplying the first
equation by $\beta$, the second by $\delta$ and adding them gives $\beta b_n + \delta d_n = c_1 (\beta b_{n-1} +
\delta d_{n-1}) + \cdots + c_k (\beta b_{n-k} + \delta d_{n-k})$.


Lemma 3.21 (Divisibility of Polynomials). Let $p(x)$ be a polynomial. Then $p(x)$
is divisible by $(x - q)^m$, for a positive integer $m$ and a nonzero constant $q$, if and
only if the following polynomials are divisible by $(x - q)$: $p_0(x) = p(x)$, $p_1(x) =
x p_0'(x)$, $p_2(x) = x p_1'(x)$, $\ldots$, $p_k(x) = x p_{k-1}'(x)$, $\ldots$, $p_{m-1}(x) = x p_{m-2}'(x)$.

Proof. Assume first that $p(x)$ is divisible by $(x - q)^m$. Then $p(x) = (x - q)^m t(x)$,
where $t(x)$ is a polynomial. We use induction on $k$ to show that $(x - q)^{m-k}$
divides $p_k$, $k = 0, \ldots, m - 1$. When $k = 0$, $(x - q)^{m-0}$ divides $p_0(x) = p(x)$.
Assume that $(x - q)^{m-k+1}$ divides $p_{k-1}(x)$. Let us show that $(x - q)^{m-k}$ divides
$p_k(x)$. We have then that $p_{k-1}(x) = (x - q)^{m-k+1} s(x)$ for some polynomial $s$. Then
$p_k(x) = x p_{k-1}'(x) = x \left( (m - k + 1)(x - q)^{m-k} s(x) + (x - q)^{m-k+1} s'(x) \right)$, so $p_k$
is divisible by $(x - q)^{m-k}$.

Now, assume that each of $p_0, \ldots, p_{m-1}$ is divisible by $x - q$. We shall show
that $p_{m-k}(x)$ is divisible by $(x - q)^k$, $k = 1, \ldots, m$, so in particular that
$p_0(x)$ is divisible by $(x - q)^m$. We do this by induction on $k$. When $k = 1$,
the statement holds by assumption. Now, assume that $p_{m-k}(x)$ is divisible
by $(x - q)^k$, let us prove that $p_{m-k-1}(x)$ is divisible by $(x - q)^{k+1}$. Assume
not, i.e., that $p_{m-k-1}(x) = (x - q)^{\ell} t(x)$, $t(q) \neq 0$, $\ell \leq k$ (where $\ell \geq 1$ since
$p_{m-k-1}$ is divisible by $x - q$). Then $p_{m-k}(x) =
x (x - q)^{\ell-1} (\ell t(x) + t'(x)(x - q))$. Since $q \neq 0$ and $\ell t(q) \neq 0$, the highest power of $(x - q)$ dividing $p_{m-k}$
is $\ell - 1 < k$. We see that $p_{m-k}(x)$ is not divisible by $(x - q)^k$, a contradiction.


Lemma 3.22 (Fundamental Solution). Let f be a linear homogeneous recurrence
with constant coefficients. If q is a root of the characteristic polynomial
of f, then $(q^n)_{n\in\mathbb{N}}$ is a solution of f for some initial values. Moreover, if q is
a root of the characteristic polynomial of multiplicity s, $s \geq 1$, then $(n^i q^n)_{n\in\mathbb{N}}$,
$0 \leq i \leq s-1$, is a solution of f.


Proof. Let $a_n = f(a_{n-1},\dots,a_{n-k}) = c_1 a_{n-1} + \dots + c_k a_{n-k}$ and $b_n = q^n$, where
q is a root of the characteristic polynomial, so $q^k - c_1 q^{k-1} - \dots - c_k q^0 = 0$.
Multiply the last equality by $q^{n-k}$ to obtain $q^n - c_1 q^{n-1} - \dots - c_k q^{n-k} = 0$. So
$(q^n)_{n\in\mathbb{N}}$ is a solution of f.


Now, assume that q has multiplicity s, i.e., the characteristic polynomial
$p_1(x)$ has the form $p_1(x) = (x-q)^s p(x)$, where p(x) is a polynomial. Observe
that q is a root of each of the following polynomials: $p_2(x) = p_1(x)x^{n-k}$, $p_2'(x)$,
$p_3(x) = x p_2'(x)$, $p_3'(x)$, $p_4(x) = x p_3'(x)$, etc. up to $p_{s+1}(x)$. Indeed, a factor
$(x-q)$ remains after each application of the product rule. So, in particular, for $i < s$

$$p_{i+2}(q) = 0. \tag{2}$$

We have

$$\begin{aligned}
p_2(x) &= x^n - c_1 x^{n-1} - \dots - c_k x^{n-k}, \\
p_2'(x) &= n x^{n-1} - c_1 (n-1) x^{n-2} - \dots - c_k (n-k) x^{n-k-1}, \\
p_3(x) &= n x^{n} - c_1 (n-1) x^{n-1} - \dots - c_k (n-k) x^{n-k}, \\
p_3'(x) &= n^2 x^{n-1} - c_1 (n-1)^2 x^{n-2} - \dots - c_k (n-k)^2 x^{n-k-1}, \\
p_4(x) &= n^2 x^{n} - c_1 (n-1)^2 x^{n-1} - \dots - c_k (n-k)^2 x^{n-k}, \\
p_{i+2}(x) &= n^i x^{n} - c_1 (n-1)^i x^{n-1} - \dots - c_k (n-k)^i x^{n-k}.
\end{aligned}$$

We need to verify that for $b_n = n^i q^n$, $(b_n)_{n\in\mathbb{N}}$ gives a solution of the recurrence
$a_n - c_1 a_{n-1} - \dots - c_k a_{n-k} = 0$. Let us plug $b_n$ for $a_n$ into the left-hand side of
the equation:

$$b_n - c_1 b_{n-1} - \dots - c_k b_{n-k} = n^i q^n - c_1 (n-1)^i q^{n-1} - \dots - c_k (n-k)^i q^{n-k} = p_{i+2}(q) = 0.$$

Here the last equality holds by (2).


Theorem 3.23 (Existence and Uniqueness). Let f be a linear homogeneous
recursion with constant coefficients and p(x) be its characteristic polynomial
with roots $q_1,\dots,q_r$ having multiplicities $s_1,\dots,s_r$, respectively. Then for any
initial values, there is a solution $(a_n)_{n\in\mathbb{N}}$ of f written in the form

$$a_n = (\ast\, q_1^n + \ast\, n q_1^n + \dots + \ast\, n^{s_1-1} q_1^n) + \dots + (\ast\, q_r^n + \ast\, n q_r^n + \dots + \ast\, n^{s_r-1} q_r^n), \tag{3}$$

where the $\ast$'s are some constants. In particular, if all roots of the characteristic
polynomial are distinct, $a_n = \ast\, q_1^n + \ast\, q_2^n + \dots + \ast\, q_k^n$.


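Proof. From Lemma 3.22, all the listed terms are solutions of the recursion
with some initial values. From Lemma 3.20, any of their linear combinations is
a solution of the recursion with some initial values. It remains to verify that
there are coefficients (the $\ast$'s) of the linear combination so that it is a solution of the
recursion with the given initial values $a_0, a_1, \dots, a_{k-1}$. Denote the coefficient
in (3) in front of $n^j q_i^n$ by $\gamma_{i,j}$, $i = 1,\dots,r$, $j = 0,\dots,s_i - 1$. That is,
the $\gamma_{i,j}$'s are the unknowns solving equation (3) for all $n \in \mathbb{N}$, which can hence
be written as

$$a_n = \sum_{i=1}^{r} \sum_{j=0}^{s_i-1} \gamma_{i,j}\, n^j q_i^n = \sum_{i=1}^{r} \left( \sum_{j=0}^{s_i-1} \gamma_{i,j}\, n^j \right) q_i^n.$$

Recall that $a_0, a_1, \dots, a_{k-1}$ are the known initial values. Plugging in
$n = 0, n = 1, \dots, n = k-1$ we get:

$$\begin{aligned}
a_0 &= (\gamma_{1,0} + 0 + \dots + 0) + \dots + (\gamma_{r,0} + 0 + \dots + 0) \\
a_1 &= (\gamma_{1,0} + \gamma_{1,1} + \dots + \gamma_{1,s_1-1})\, q_1 + \dots + (\gamma_{r,0} + \gamma_{r,1} + \dots + \gamma_{r,s_r-1})\, q_r \\
a_2 &= (\gamma_{1,0} + 2\gamma_{1,1} + \dots + 2^{s_1-1}\gamma_{1,s_1-1})\, q_1^2 + \dots + (\gamma_{r,0} + 2\gamma_{r,1} + \dots + 2^{s_r-1}\gamma_{r,s_r-1})\, q_r^2 \\
&\ \ \vdots \\
a_{k-1} &= (\gamma_{1,0} + (k-1)\gamma_{1,1} + \dots + (k-1)^{s_1-1}\gamma_{1,s_1-1})\, q_1^{k-1} + \dots + (\gamma_{r,0} + (k-1)\gamma_{r,1} + \dots + (k-1)^{s_r-1}\gamma_{r,s_r-1})\, q_r^{k-1}
\end{aligned} \tag{4}$$

The matrix of this system of linear equations with unknowns $\gamma_{i,j}$, $i = 1,\dots,r$,
$j = 0,\dots,s_i - 1$ is:

$$\begin{pmatrix}
1 & 0 & \cdots & 0 & \cdots & 1 & 0 & \cdots & 0 \\
q_1 & q_1 & \cdots & q_1 & \cdots & q_r & q_r & \cdots & q_r \\
q_1^2 & 2q_1^2 & \cdots & 2^{s_1-1}q_1^2 & \cdots & q_r^2 & 2q_r^2 & \cdots & 2^{s_r-1}q_r^2 \\
q_1^3 & 3q_1^3 & \cdots & 3^{s_1-1}q_1^3 & \cdots & q_r^3 & 3q_r^3 & \cdots & 3^{s_r-1}q_r^3 \\
\vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\
q_1^{k-1} & (k-1)q_1^{k-1} & \cdots & (k-1)^{s_1-1}q_1^{k-1} & \cdots & q_r^{k-1} & (k-1)q_r^{k-1} & \cdots & (k-1)^{s_r-1}q_r^{k-1}
\end{pmatrix} \tag{5}$$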
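This is a square $k \times k$ matrix. We shall show that it is non-singular by proving
that its rows $r_1, r_2, \dots, r_k$ are linearly independent. Assume that there are
coefficients $\alpha_1,\dots,\alpha_k$, not all equal to zero, so that $\alpha_1 r_1 + \alpha_2 r_2 + \dots + \alpha_k r_k = 0$.
The components of this vector correspond to the respective columns of the matrix
in the following form:

$$\begin{aligned}
\alpha_1 + \alpha_2 q_1 + \alpha_3 q_1^2 + \dots + \alpha_k q_1^{k-1} &= 0 \\
\alpha_1 \cdot 0 + \alpha_2 q_1 + \alpha_3\, 2 q_1^2 + \dots + \alpha_k (k-1) q_1^{k-1} &= 0 \\
\alpha_1 \cdot 0 + \alpha_2 q_1 + \alpha_3\, 2^2 q_1^2 + \dots + \alpha_k (k-1)^2 q_1^{k-1} &= 0 \\
&\ \ \vdots \\
\alpha_1 + \alpha_2 q_r + \alpha_3 q_r^2 + \dots + \alpha_k q_r^{k-1} &= 0 \\
\alpha_1 \cdot 0 + \alpha_2 q_r + \alpha_3\, 2 q_r^2 + \dots + \alpha_k (k-1) q_r^{k-1} &= 0 \\
\alpha_1 \cdot 0 + \alpha_2 q_r + \alpha_3\, 2^2 q_r^2 + \dots + \alpha_k (k-1)^2 q_r^{k-1} &= 0 \\
&\ \ \vdots
\end{aligned}$$

Let $h(x) = \alpha_1 + \alpha_2 x + \alpha_3 x^2 + \dots + \alpha_k x^{k-1}$. Then we see that $q_1$ is a root of h(x), as
well as a root of $h_2(x) = x h'(x)$, $h_3(x) = x h_2'(x)$, ..., $h_{s_1}(x)$. By Lemma 3.21, h(x)
is divisible by $(x-q_1)^{s_1}$. Similarly, h(x) is divisible by $(x-q_2)^{s_2}, \dots, (x-q_r)^{s_r}$.
Thus h(x) is divisible by $\prod_{i=1}^{r}(x-q_i)^{s_i}$, so the degree of h(x) is at least
$k = \sum_{i=1}^{r} s_i$, a contradiction to the definition of h(x), which has degree at most $k-1$.
Thus there are no such coefficients $\alpha_i$ and our matrix is non-singular. Therefore
the system (4) has a solution for any given right-hand side $(a_0, a_1, \dots, a_{k-1})$.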


Theorem 3.23 gives us a tool to solve any linear homogeneous recursion with 
constant coefficients. The main steps are the following. 


1. Derive the characteristic polynomial p(x).

2. Compute the roots $q_1,\dots,q_r$ of p(x) with respective multiplicities $s_1,\dots,s_r$.

3. Solve the system of linear equations (4) (for example by Gaussian
elimination).

4. Write down the explicit solution $a_n = \sum_{i=1}^{r}\sum_{j=0}^{s_i-1} \gamma_{i,j}\, n^j q_i^n$.

Let us illustrate this approach with a few examples. 
Example 3.24. We solve the recursion

$$a_n = -5a_{n-1} + 6a_{n-2}, \quad a_0 = 3,\ a_1 = 10.$$

The characteristic polynomial is $x^2 + 5x - 6$, its roots are $q_1 = 1$ and $q_2 = -6$,
both with multiplicity 1, i.e., $s_1 = s_2 = 1$. Thus the general solution is
$\gamma_{1,0} 1^n + \gamma_{2,0}(-6)^n$. Plugging in the initial values, we obtain

$$a_0 = \gamma_{1,0} + \gamma_{2,0} = 3,$$
$$a_1 = \gamma_{1,0} - 6\gamma_{2,0} = 10.$$

Thus $\gamma_{2,0} = -1$, $\gamma_{1,0} = 4$, and the recursion is solved by $(a_n)_{n\in\mathbb{N}}$ with

$$a_n = 4 - (-6)^n.$$
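The closed form of Example 3.24 can be checked against the recurrence numerically. The following is only a small sanity check; the function names `a_rec` and `a_closed` are chosen here for illustration and do not appear in the notes.

```python
def a_rec(n):
    """a_n computed directly from a_n = -5 a_{n-1} + 6 a_{n-2}, a_0 = 3, a_1 = 10."""
    prev, cur = 3, 10
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, -5 * cur + 6 * prev
    return cur


def a_closed(n):
    """Closed form obtained from the roots 1 and -6 of x^2 + 5x - 6."""
    return 4 - (-6) ** n


# Compare both descriptions of the sequence on the first few indices.
print([a_rec(n) for n in range(6)])
print(all(a_rec(n) == a_closed(n) for n in range(25)))
```

Since all quantities are integers, the comparison is exact and does not depend on floating point rounding.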
Example 3.25. We solve the recursion

$$a_n = -4a_{n-1} - 4a_{n-2}, \quad a_0 = 3,\ a_1 = 10.$$

The characteristic polynomial is $x^2 + 4x + 4$, its only root is $q_1 = -2$, with
multiplicity 2, i.e., $s_1 = 2$. Thus the general solution is $\gamma_{1,0}(-2)^n + \gamma_{1,1} n(-2)^n$.
Plugging in the initial values, we obtain

$$a_0 = \gamma_{1,0} = 3,$$
$$a_1 = \gamma_{1,0}(-2) + \gamma_{1,1}(-2) = 10.$$

Thus $\gamma_{1,0} = 3$, $\gamma_{1,1} = -8$, and the recursion is solved by $(a_n)_{n\in\mathbb{N}}$ with

$$a_n = 3(-2)^n - 8n(-2)^n.$$
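For a root of multiplicity 2 the extra factor n in the second summand is essential. The sketch below (function names are illustrative, not from the notes) compares the closed form of Example 3.25 with the recurrence.

```python
def a_rec(n):
    """a_n from a_n = -4 a_{n-1} - 4 a_{n-2}, a_0 = 3, a_1 = 10."""
    seq = [3, 10]
    while len(seq) <= n:
        seq.append(-4 * seq[-1] - 4 * seq[-2])
    return seq[n]


def a_closed(n):
    """Closed form for the double root -2: (gamma_10 + gamma_11 * n) * (-2)^n."""
    return (3 - 8 * n) * (-2) ** n


print([a_rec(n) for n in range(6)])
print(all(a_rec(n) == a_closed(n) for n in range(25)))
```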
Example 3.26. We solve the recursion

$$a_n = -4a_{n-2}, \quad a_0 = 6,\ a_1 = 20.$$

The characteristic polynomial is $x^2 + 4$, its roots are $q_1 = 2i$ and $q_2 = -2i$, both
with multiplicity 1, i.e., $s_1 = s_2 = 1$. Thus the general solution is
$\gamma_{1,0}(2i)^n + \gamma_{2,0}(-2i)^n$. Plugging in the initial values, we obtain

$$a_0 = \gamma_{1,0} + \gamma_{2,0} = 6,$$
$$a_1 = \gamma_{1,0}(2i) + \gamma_{2,0}(-2i) = 20.$$

Thus $\gamma_{1,0} = 3 - 5i$, $\gamma_{2,0} = 3 + 5i$. Therefore

$$a_n = (3-5i)(2i)^n + (3+5i)(-2i)^n = (2i)^n\left(3 - 5i + (-1)^n 3 + (-1)^n 5i\right).$$

So, if n is odd, $a_n = (2i)^n(-2\cdot 5i)$, and if n is even, $a_n = (2i)^n \cdot 6$. Note that
$i^n = e^{i\pi n/2} = \cos(\pi n/2) + i\sin(\pi n/2)$, so $i^n = 1$ for $n = 4k$, $i^n = i$ for $n = 4k+1$,
$i^n = -1$ for $n = 4k+2$, and $i^n = -i$ for $n = 4k+3$.
Hence the recursion is solved by $(a_n)_{n\in\mathbb{N}}$ with

$$a_n = \begin{cases}
6 \cdot 2^n & \text{if } n = 4k, \\
5 \cdot 2^{n+1} & \text{if } n = 4k+1, \\
-6 \cdot 2^n & \text{if } n = 4k+2, \\
-5 \cdot 2^{n+1} & \text{if } n = 4k+3.
\end{cases}$$
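Complex roots still yield a real (here even integer) sequence. As a hedged illustration, the sketch below evaluates the complex closed form of Example 3.26 exactly, representing a + bi as the integer pair (a, b), and compares it with the case distinction and with the recurrence. All helper names are assumptions made for this example.

```python
def gmul(u, v):
    """Multiply Gaussian integers u = (a, b) ~ a + bi and v = (c, d) ~ c + di."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)


def gpow(u, n):
    """u^n for a Gaussian integer u and n >= 0 by repeated multiplication."""
    result = (1, 0)
    for _ in range(n):
        result = gmul(result, u)
    return result


def a_complex(n):
    """(3 - 5i)(2i)^n + (3 + 5i)(-2i)^n as a pair (real part, imaginary part)."""
    x = gmul((3, -5), gpow((0, 2), n))
    y = gmul((3, 5), gpow((0, -2), n))
    return (x[0] + y[0], x[1] + y[1])


def a_piecewise(n):
    """The real case distinction by n mod 4."""
    r = n % 4
    if r == 0:
        return 6 * 2 ** n
    if r == 1:
        return 5 * 2 ** (n + 1)
    if r == 2:
        return -6 * 2 ** n
    return -5 * 2 ** (n + 1)


def a_rec(n):
    """a_n from a_n = -4 a_{n-2}, a_0 = 6, a_1 = 20."""
    seq = [6, 20]
    while len(seq) <= n:
        seq.append(-4 * seq[-2])
    return seq[n]


print([a_rec(n) for n in range(8)])
print(all(a_complex(n) == (a_rec(n), 0) == (a_piecewise(n), 0) for n in range(24)))
```

The imaginary part of the complex expression vanishes for every n, which is why the second component is compared with 0.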


Recall from Example 3.17 that the Fibonacci sequence $(F_n)_{n\in\mathbb{N}}$ solves the
recursion

$$F_n = F_{n-1} + F_{n-2}, \quad F_0 = 0,\ F_1 = 1.$$

We shall now derive an explicit formula for $F_n$.


Theorem 3.27 (Binet’s Formula). For the Fibonacci numbers we have:

$$F_n = \frac{1}{\sqrt{5}}\left(\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right)$$

where $\Phi := \frac{1+\sqrt{5}}{2}$ is called the Golden Ratio.


Figure 22: In a drawing like this one, obtained by squares “spiraling” out of the
center, the ratio of width and height will approach the golden ratio. Of course,
you already know all about spirals, since you watched all Vi Hart Videos, right?


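Proof. The recurrence relation is $F_{n+2} = F_{n+1} + F_n$ with starting values
$F_0 = 0$, $F_1 = 1$. So the characteristic polynomial is $p(x) = x^2 - x - 1$. The roots of
p(x) are $q_{1,2} = \frac{1 \pm \sqrt{5}}{2}$, i.e.

$$p(x) = \left(x - \frac{1+\sqrt{5}}{2}\right)\left(x - \frac{1-\sqrt{5}}{2}\right).$$

The general solution for $F_n$ is therefore (by Theorem 3.23)

$$F_n = \gamma_{1,0}\left(\frac{1+\sqrt{5}}{2}\right)^n + \gamma_{2,0}\left(\frac{1-\sqrt{5}}{2}\right)^n.$$

We need to find $\gamma_{1,0}, \gamma_{2,0}$ from initial values³.

$$0 = F_0 = \gamma_{1,0}\left(\frac{1+\sqrt{5}}{2}\right)^0 + \gamma_{2,0}\left(\frac{1-\sqrt{5}}{2}\right)^0 = \gamma_{1,0} + \gamma_{2,0},$$

$$1 = F_1 = \gamma_{1,0}\left(\frac{1+\sqrt{5}}{2}\right)^1 + \gamma_{2,0}\left(\frac{1-\sqrt{5}}{2}\right)^1.$$

The unique solution is $\gamma_{1,0} = \frac{1}{\sqrt{5}}$, $\gamma_{2,0} = -\frac{1}{\sqrt{5}}$.

In the explicit formula we have just shown, the term $\left(\frac{1-\sqrt{5}}{2}\right)^n \approx (-0.62)^n$
quickly goes to zero. This allows us to ignore it when computing Fibonacci
numbers: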

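Corollary 3.28. $F_n$ is the integer closest to $\frac{1}{\sqrt{5}}\Phi^n$, i.e. $F_n = \left\lfloor \frac{1}{\sqrt{5}}\Phi^n + \frac{1}{2} \right\rfloor$.


Proof. It suffices to show that $\left|\frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n\right| < \frac{1}{2}$. Since $\left|\frac{1-\sqrt{5}}{2}\right| < 1$ and $\sqrt{5} > 2$, we have

$$\left|\frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n\right| = \frac{1}{\sqrt{5}}\left|\frac{1-\sqrt{5}}{2}\right|^n \leq \frac{1}{\sqrt{5}} < \frac{1}{2}.$$


³ Any two starting values will do, we could determine $\gamma_{1,0}, \gamma_{2,0}$ from $F_4 = 3$, $F_5 = 5$ if we
wanted to. But using $F_0$ and $F_1$ yields the simplest system of equations.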
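The rounding statement of the corollary is easy to try out. The following sketch uses floating point arithmetic, so it is only reliable for moderately small n (the range used below is a cautious assumption, not a sharp bound); the names `fib` and `fib_rounded` are illustrative.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the golden ratio


def fib(n):
    """F_n from the recurrence with F_0 = 0, F_1 = 1 (exact integer arithmetic)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_rounded(n):
    """The integer closest to PHI^n / sqrt(5), computed with floats."""
    return math.floor(PHI ** n / math.sqrt(5) + 0.5)


print([fib_rounded(n) for n in range(12)])
print(all(fib(n) == fib_rounded(n) for n in range(40)))
```

For large n the float `PHI ** n` no longer carries enough digits, so an exact computation should fall back to the recurrence.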


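Example 3.29 (Ternary n-string without 20). In Example 3.18 we found starting
values and the recurrence but we have yet to determine the values. Recall:

$$t_1 = 3, \quad t_2 = 8, \quad t_{n+2} = 3t_{n+1} - t_n.$$

So $p(x) = x^2 - 3x + 1$. The characteristic polynomial has the roots $q_{1,2} = \frac{3 \pm \sqrt{5}}{2}$.
So by Theorem 3.23 the general solution is

$$t(n) = \gamma_{1,0}\left(\frac{3+\sqrt{5}}{2}\right)^n + \gamma_{2,0}\left(\frac{3-\sqrt{5}}{2}\right)^n.$$

Again we use the initial values to determine $\gamma_{1,0}$ and $\gamma_{2,0}$. For simplicity,
use $t_0 = 1$ instead of $t_2 = 8$:

$$1 = t(0) = \gamma_{1,0} + \gamma_{2,0},$$

$$3 = t(1) = \gamma_{1,0}\,\frac{3+\sqrt{5}}{2} + \gamma_{2,0}\,\frac{3-\sqrt{5}}{2}.$$

Solving yields $\gamma_{1,0} = \frac{5+3\sqrt{5}}{10}$, $\gamma_{2,0} = \frac{5-3\sqrt{5}}{10}$ and therefore:

$$t(n) = \left(\frac{5+3\sqrt{5}}{10}\right)\left(\frac{3+\sqrt{5}}{2}\right)^n + \left(\frac{5-3\sqrt{5}}{10}\right)\left(\frac{3-\sqrt{5}}{2}\right)^n.$$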


3.3.2 Advancement Operator 


Remark. This section contains an alternative treatment of recurrence relations 
in terms of the so-called advancement operator. We present this material, even 
though it is repetitive and everything is already covered in Section 3.3.1 with 
the approach of the characteristic polynomial. 


We can consider any sequence $(f_n)_{n\in\mathbb{N}}$ as a function $f : \mathbb{N} \to \mathbb{R}$ where
$f(n) = f_n$.

If f fulfills a linear recurrence relation with constant coefficients, then there
is a unique way to extend f to $\mathbb{Z}$ such that the recurrence relation is still fulfilled.
Take for instance the Fibonacci sequence: Given the starting values there is a
unique way to calculate backwards using $F(n-2) = F(n) - F(n-1)$:

n    | ... -5 -4 -3 -2 -1  0  1  2  3  4  5 ...
F(n) | ...  5 -3  2 -1  1  0  1  1  2  3  5 ...
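The backward extension in the table can be reproduced mechanically. The sketch below is a minimal illustration; `fib_extended` is a name invented here, and it assumes the starting values F(0) = 0, F(1) = 1.

```python
def fib_extended(lo, hi):
    """Return {n: F(n)} for lo <= n <= hi, assuming lo <= 0 and hi >= 1."""
    F = {0: 0, 1: 1}
    # Forward direction: F(n) = F(n-1) + F(n-2).
    for n in range(2, hi + 1):
        F[n] = F[n - 1] + F[n - 2]
    # Backward direction: F(n) = F(n+2) - F(n+1).
    for n in range(-1, lo - 1, -1):
        F[n] = F[n + 2] - F[n + 1]
    return F


table = fib_extended(-5, 5)
print([table[n] for n in range(-5, 6)])
```

Each backward step is forced by the recurrence, which is exactly the uniqueness claimed in the text.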


This makes f an element of the vector space of all functions from $\mathbb{Z}$ to $\mathbb{R}$.
We define the advancement operator A acting on this vector space. It maps a
function f to the function Af with:

$$(Af)(n) = f(n+1).$$

An advancement operator polynomial is a polynomial in A, for example $3A^2 - A + 6$.
This is also an operator which maps f to the function $(3A^2 - A + 6)f$
with:

$$((3A^2 - A + 6)f)(n) = 3f(n+2) - f(n+1) + 6f(n).$$

This notation allows us to write linear recurrence equations with constant
coefficients as:

$$p(A)f = g$$

where p is some polynomial. For the Fibonacci numbers we would write
$(A^2 - A - 1)F = 0$.
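Operators on functions translate naturally into higher-order functions. The following sketch is one possible encoding, not part of the notes: `advance` models A, and `apply_polynomial` takes the coefficients of p(A) in increasing powers of A (both names and the coefficient order are assumptions made here).

```python
def advance(f):
    """The advancement operator A: (Af)(n) = f(n + 1)."""
    return lambda n: f(n + 1)


def apply_polynomial(coeffs, f):
    """Return p(A)f for p(A) = coeffs[0] + coeffs[1]*A + coeffs[2]*A^2 + ..."""
    return lambda n: sum(c * f(n + i) for i, c in enumerate(coeffs))


def fib(n):
    """Fibonacci numbers with F(0) = 0, F(1) = 1, for n >= 0."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def square(n):
    return n * n


# (3A^2 - A + 6) applied to f(n) = n^2, evaluated at n = 1:
g = apply_polynomial([6, -1, 3], square)
print(g(1), 3 * square(3) - square(2) + 6 * square(1))

# (A^2 - A - 1)F = 0 for the Fibonacci numbers:
zero = apply_polynomial([-1, -1, 1], fib)
print([zero(n) for n in range(10)])
```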


In the following we try to solve equations of this kind. We start with the eas- 
iest case of a homogeneous equation and an advancement operator polynomial 
of degree one. 


Lemma 3.30. For a real number $r \neq 0$ the solution to $(A - r)f = 0$ with initial
value $f(0) = c$ is given by $f(n) = c r^n$.


Proof.

$$\begin{aligned}
(A - r)f(n) = 0 \ &\Leftrightarrow\ f(n+1) - r f(n) = 0 \\
&\Leftrightarrow\ f(n+1) = r f(n) \\
&\Rightarrow\ f(n) = r^n \cdot f(0) = r^n \cdot c.
\end{aligned}$$


Raising the difficulty a bit, now consider the advancement operator polynomial
$p(A) = A^2 + A - 6 = (A+3)\cdot(A-2)$ and the corresponding homogeneous
recurrence $p(A)f = (A+3)(A-2)f = 0$. Note that solutions of $(A-2)f = 0$
or $(A+3)f = 0$ are also solutions of $(A+3)(A-2)f = 0$.⁴

Actually, all solutions are sums of such solutions, which means that, using
the last Lemma, we know that f is of the form

$$f(n) = c_1(-3)^n + c_2 2^n.$$

Here, $c_1$ and $c_2$ are constants that can be derived from two initial values for f.
It is no coincidence that we found a two dimensional space of functions:


Lemma 3.31. If p(A) has degree k and its constant term is non-zero, then the
set of all solutions f to $p(A)f = 0$ is a k-dimensional subspace of functions from
$\mathbb{Z} \to \mathbb{C}$, parametrized by $c_1,\dots,c_k$ which can be determined by k initial values
for f.


We do not give a formal proof. However, you should not be surprised by
the statement. It is clear that the set of all solutions forms a subspace: If f, g are
solutions and $\alpha, \beta \in \mathbb{C}\setminus\{0\}$ then $\alpha f + \beta g$ is also a solution since
$p(A)(\alpha f + \beta g) = \alpha p(A)f + \beta p(A)g = \alpha\cdot 0 + \beta\cdot 0 = 0$. It is also intuitive that there are k degrees
of freedom: For every choice of values of f on $\{1,2,\dots,k\}$ the recurrence gives
a unique way to extend f (upwards and downwards) to other values.

Generalizing our observation for the advancement operator polynomial
$(A+3)(A-2)$, we state (also without proof):


⁴ There is some low-level notational magic involved here: We need $((A+3)\cdot(A-2))f =
(A+3)((A-2)f) = (A-2)((A+3)f)$ where $\cdot$ is multiplication of polynomials and “nothing”
is the application of an operator to a function.
Proposition 3.32. If $p(A) = (A - r_1)\cdot(A - r_2)\cdot\ldots\cdot(A - r_k)$ for distinct
$r_1,\dots,r_k$, then all solutions of $p(A)f = 0$ are of the form

$$f(n) = c_1 r_1^n + c_2 r_2^n + \dots + c_k r_k^n.$$


Every polynomial of degree k has k complex roots, but those roots are not 
necessarily distinct. The following Theorem handles the case of roots with 
multiplicity at least two and is therefore the last piece in the puzzle. 

From now on, instead of “all solutions are of the form...” we say the “general 
solution is ...”. 


Theorem 3.33. If $r \neq 0$ and $p(A) = (A - r)^k$, then the general solution of the
homogeneous system $p(A)f = 0$ is

$$f(n) = c_1 r^n + c_2\, n r^n + c_3\, n^2 r^n + \dots + c_k\, n^{k-1} r^n.$$

If $p(A) = q_1(A)\cdot q_2(A)$ and $q_1(A), q_2(A)$ have no root in common, then the
general solution of $p(A)f = 0$ is the sum of the general solutions of $q_1(A)f = 0$
and $q_2(A)f = 0$.


Example. Let $p(A) = (A-1)^5\cdot(A+1)^2\cdot(A-3)$. Then the general solution
to $p(A)f = 0$ is

$$f(n) = c_1 + n\, c_2 + n^2 c_3 + n^3 c_4 + n^4 c_5 + c_6(-1)^n + n\, c_7(-1)^n + c_8 3^n.$$

Note that the condition $r \neq 0$ (i.e. that the constant term of p(A) is not zero)
is no real restriction: An advancement operator polynomial of the form $p(A)\cdot A$
describes the same recurrence as p(A), just at a different index.


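Theorem 3.34 (Binet’s Formula). For the Fibonacci numbers we have:

$$F_n = \frac{1}{\sqrt{5}}\left(\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right)$$

where $\Phi := \frac{1+\sqrt{5}}{2}$ is called the Golden Ratio.


Proof. The recurrence relation is $F_{n+2} = F_{n+1} + F_n$ with starting values $F_0 = 0$,
$F_1 = 1$. Rewrite this in terms of an advancement operator polynomial:

$$f(n+2) - f(n+1) - f(n) = 0 \ \Leftrightarrow\ \underbrace{(A^2 - A - 1)}_{p(A)} f(n) = 0.$$

The roots of p(A) are $\frac{1 \pm \sqrt{5}}{2}$, i.e.

$$p(A) = \left(A - \frac{1+\sqrt{5}}{2}\right)\left(A - \frac{1-\sqrt{5}}{2}\right).$$

The general solution for f(n) is therefore (by Proposition 3.32)

$$f(n) = c_1\left(\frac{1+\sqrt{5}}{2}\right)^n + c_2\left(\frac{1-\sqrt{5}}{2}\right)^n.$$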
Figure 23: In a drawing like this one, obtained by squares “spiraling” out of the
center, the ratio of width and height will approach the golden ratio. Of course,
you already know all about spirals, since you watched all Vi Hart Videos, right?


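We need to find $c_1, c_2$ from initial values⁵.

$$0 = f(0) = c_1\left(\frac{1+\sqrt{5}}{2}\right)^0 + c_2\left(\frac{1-\sqrt{5}}{2}\right)^0 = c_1 + c_2,$$

$$1 = f(1) = c_1\left(\frac{1+\sqrt{5}}{2}\right)^1 + c_2\left(\frac{1-\sqrt{5}}{2}\right)^1.$$

The unique solution is $c_1 = \frac{1}{\sqrt{5}}$, $c_2 = -\frac{1}{\sqrt{5}}$.

In the explicit formula we have just shown, the term $\left(\frac{1-\sqrt{5}}{2}\right)^n \approx (-0.62)^n$
quickly goes to zero. This allows us to ignore it when computing Fibonacci
numbers:


Corollary 3.35. $F_n$ is the integer closest to $\frac{1}{\sqrt{5}}\Phi^n$, i.e. $F_n = \left\lfloor \frac{1}{\sqrt{5}}\Phi^n + \frac{1}{2} \right\rfloor$.


Proof. It suffices to show that $\left|\frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n\right| < \frac{1}{2}$. Since $\left|\frac{1-\sqrt{5}}{2}\right| < 1$ and $\sqrt{5} > 2$, we have

$$\left|\frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n\right| = \frac{1}{\sqrt{5}}\left|\frac{1-\sqrt{5}}{2}\right|^n \leq \frac{1}{\sqrt{5}} < \frac{1}{2}.$$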


Example 3.36 (Ternary n-string without 20). In Example 3.18 we found starting
values and the recurrence but we have yet to determine the values. Recall:

$$t_1 = 3, \quad t_2 = 8, \quad t_{n+2} = 3t_{n+1} - t_n.$$

So $p(A)t = (A^2 - 3A + 1)t = 0$. The advancement operator polynomial has the
roots $\frac{3 \pm \sqrt{5}}{2}$. So by Proposition 3.32 the general solution is

$$t(n) = c_1\left(\frac{3+\sqrt{5}}{2}\right)^n + c_2\left(\frac{3-\sqrt{5}}{2}\right)^n.$$

Again we use the initial values to determine $c_1$ and $c_2$. For simplicity, use
$t_0 = 1$ instead of $t_2 = 8$:

$$1 = t(0) = c_1 + c_2,$$

$$3 = t(1) = c_1\,\frac{3+\sqrt{5}}{2} + c_2\,\frac{3-\sqrt{5}}{2}.$$

Solving yields $c_1 = \frac{5+3\sqrt{5}}{10}$, $c_2 = \frac{5-3\sqrt{5}}{10}$ and therefore:

$$t(n) = \left(\frac{5+3\sqrt{5}}{10}\right)\left(\frac{3+\sqrt{5}}{2}\right)^n + \left(\frac{5-3\sqrt{5}}{10}\right)\left(\frac{3-\sqrt{5}}{2}\right)^n.$$

⁵ Any two starting values will do, we could determine $c_1, c_2$ from $F_4 = 3$, $F_5 = 5$ if we
wanted to. But using $F_0$ and $F_1$ yields the simplest system of equations.

3.3.3 Non-homogeneous Recurrences


So far we ignored non-homogeneous linear recurrence relations, i.e. those of the
form

$$a_n = c_1(n)a_{n-1} + \dots + c_k(n)a_{n-k} + g(n)$$

for some non-trivial function $g : \mathbb{N} \to \mathbb{R}$.


Lemma 3.37. Any two solutions $(a_n)_{n\in\mathbb{N}}$, $(a'_n)_{n\in\mathbb{N}}$ of a non-homogeneous linear
recurrence $a_n = f(a_{n-1},\dots,a_{n-k}) + g(n)$ “differ by” a solution of the
corresponding homogeneous recurrence $a_n = f(a_{n-1},\dots,a_{n-k})$, i.e. there is $(a^*_n)_{n\in\mathbb{N}}$
such that $a_n = a^*_n + a'_n$ for all $n \in \mathbb{N}$ with $a^*_n = f(a^*_{n-1},\dots,a^*_{n-k})$ for all $n \geq k$.


Proof. Just set $a^*_n = a_n - a'_n$ for all $n \in \mathbb{N}$. Then for all $n \geq k$ we have, using that f is linear,

$$a^*_n = a_n - a'_n = f(a_{n-1},\dots,a_{n-k}) + g(n) - f(a'_{n-1},\dots,a'_{n-k}) - g(n) = f(a^*_{n-1},\dots,a^*_{n-k})$$

as desired.


This means that to find all solutions of $a_n = f(a_{n-1},\dots,a_{n-k}) + g(n)$ it suffices
to find a single such solution $(a'_n)_{n\in\mathbb{N}}$, the particular solution, and all solutions
$(a^*_n)_{n\in\mathbb{N}}$ of $a_n = f(a_{n-1},\dots,a_{n-k})$, the general solution of the homogeneous recurrence.

Then the general solution of the non-homogeneous recurrence is given by
$a_n = a^*_n + a'_n$ for $n \in \mathbb{N}$.

Since there is no framework that always works to find particular solutions, 
this is actually the difficult part. 


Example 3.38. Recall from Example 1.6 (and the corresponding problem on
the exercise sheet) that when cutting a two dimensional cake into the maximum
number of pieces, the n-th cut yields n additional pieces, i.e.

$$s_n = s_{n-1} + n, \quad s_0 = 1 \quad\text{and therefore}\quad s_n = c_1(n)s_{n-1} + g(n)$$

for $c_1(n) = 1$ and $g(n) = n$ for $n \in \mathbb{N}$.

1. Find the general solution for the homogeneous recurrence:

$$s_n = s_{n-1} \ \Rightarrow\ p(x) = x - 1 \ \Rightarrow\ q_1 = 1,\ s_1 = 1 \ \Rightarrow\ a^*_n = \gamma_{1,0} 1^n = \gamma_{1,0}.$$

2. Find a particular solution for the non-homogeneous recurrence $s_n = s_{n-1} + n$.
We guess that the solution is “similar” to the right hand side. Since the
right hand side (i.e. $g(n) = n$) is a polynomial of degree 1 in n, we guess
that the solution could be a polynomial of degree 2 in n, meaning

$$a'_n = d_1 n^2 + d_2 n + d_3$$

where $d_1, d_2, d_3$ are to be determined. We plug this guess into the desired
recurrence:

$$\begin{aligned}
d_1 n^2 + d_2 n + d_3 = a'_n = a'_{n-1} + n &= d_1(n-1)^2 + d_2(n-1) + d_3 + n \\
&= d_1 n^2 + d_2 n + d_3 + (-2d_1 n + d_1 - d_2 + n)
\end{aligned}$$

which has the solution $d_1 = d_2 = \frac{1}{2}$ (note that we “lost” $d_3$ on the way,
so it is not needed and we just set it to zero). So we found one particular
solution:

$$a'_n = \frac{1}{2}n^2 + \frac{1}{2}n = \frac{n(n+1)}{2} = \binom{n+1}{2}.$$

3. The general solution for the non-homogeneous recurrence is the sum

$$s_n = a^*_n + a'_n = \gamma_{1,0} + \binom{n+1}{2}.$$

We still need to determine $\gamma_{1,0}$ using the initial value:

$$1 = s_0 = \gamma_{1,0} + 0 \ \Rightarrow\ \gamma_{1,0} = 1.$$

So the solution is $s_n = 1 + \binom{n+1}{2}$.
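The result of Example 3.38 can be confirmed by iterating the recurrence. This is a minimal check with illustrative function names; `math.comb` requires Python 3.8 or newer.

```python
from math import comb


def pieces_rec(n):
    """s_n from s_n = s_{n-1} + n with s_0 = 1."""
    s = 1
    for i in range(1, n + 1):
        s += i
    return s


def pieces_closed(n):
    """Homogeneous part gamma_10 = 1 plus the particular solution C(n+1, 2)."""
    return 1 + comb(n + 1, 2)


print([pieces_rec(n) for n in range(8)])
print(all(pieces_rec(n) == pieces_closed(n) for n in range(50)))
```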
Example 3.39. We solve the recurrence:

$$a_n = 4a_{n-1} - 4a_{n-2} + 3^n + 2n$$

1. General solution for the homogeneous recurrence:

$$a_n = 4a_{n-1} - 4a_{n-2} \ \Rightarrow\ p(x) = x^2 - 4x + 4 = (x-2)^2 \ \Rightarrow\ q_1 = 2,\ s_1 = 2$$
$$\Rightarrow\ a^*_n = \gamma_{1,0} q_1^n + \gamma_{1,1} n q_1^n = \gamma_{1,0} 2^n + \gamma_{1,1} n 2^n.$$

2. Particular solution of the non-homogeneous recurrence:

$$a_n = 4a_{n-1} - 4a_{n-2} + 3^n + 2n$$

Guess something similar to the right hand side. Our attempt is⁶:

$$a'_n = d_1 \cdot 3^n + d_2 n + d_3.$$

⁶ We could have guessed another summand of $d_4 \cdot n^2$, but as it turns out, it is not needed.

Which means we need to find $d_1, d_2, d_3$ such that:

$$\begin{aligned}
d_1 \cdot 3^n + d_2 n + d_3 = a'_n &= 4a'_{n-1} - 4a'_{n-2} + 3^n + 2n \\
&= 4d_1 \cdot 3^{n-1} + 4d_2(n-1) + 4d_3 \\
&\quad - 4d_1 \cdot 3^{n-2} - 4d_2(n-2) - 4d_3 + 3^n + 2n \\
&= 8d_1 \cdot 3^{n-2} + 4d_2 + 3^n + 2n \\
&= d_1 \cdot 3^n + d_2 n + d_3 \\
&\quad + (-d_1 \cdot 3^{n-2} - d_2 n - d_3 + 4d_2 + 3^n + 2n) \\
&= d_1 \cdot 3^n + d_2 n + d_3 \\
&\quad + ((9 - d_1)3^{n-2} + (2 - d_2)n - d_3 + 4d_2)
\end{aligned}$$

and we get $d_1 = 9$, $d_2 = 2$, $d_3 = 8$, i.e.

$$a'_n = 3^{n+2} + 2n + 8.$$

3. The general solution of the non-homogeneous recurrence is therefore

$$a_n = a^*_n + a'_n = \gamma_{1,0} 2^n + \gamma_{1,1} n 2^n + 3^{n+2} + 2n + 8.$$

Starting values would now allow us to determine $\gamma_{1,0}$ and $\gamma_{1,1}$ but we omit
this here.
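Since the example leaves the two constants open, a quick check can simply try several values for them. In the sketch below, `g0` and `g1` stand for the two free constants of the homogeneous part; all names are chosen for this illustration.

```python
def particular(n):
    """The particular solution found in step 2: 3^(n+2) + 2n + 8."""
    return 3 ** (n + 2) + 2 * n + 8


def general(n, g0, g1):
    """General solution: homogeneous part g0*2^n + g1*n*2^n plus the particular one."""
    return g0 * 2 ** n + g1 * n * 2 ** n + particular(n)


def satisfies_recurrence(a, n):
    """Check a(n) = 4 a(n-1) - 4 a(n-2) + 3^n + 2n at a single index n >= 2."""
    return a(n) == 4 * a(n - 1) - 4 * a(n - 2) + 3 ** n + 2 * n


ok = all(
    satisfies_recurrence(lambda m: general(m, g0, g1), n)
    for g0 in range(-3, 4)
    for g1 in range(-3, 4)
    for n in range(2, 15)
)
print(ok)
```

Every choice of the two constants gives a solution, matching the two degrees of freedom of a recurrence of order 2.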
3.3.4 Solving Recurrences using Generating Functions 


We now examine one example of a recurrence relation with non-constant coef- 
ficients. 


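Example 3.40 (Derangements yet again). Recall from Theorem 1.24 that for
$d_n = |D_n|$, the number of derangements on [n], we have

$$d_0 = 1, \quad d_1 = 0, \quad d_2 = 1, \quad d_n = (n-1)(d_{n-1} + d_{n-2}) \quad (\text{for } n \geq 2).$$

This constitutes a homogeneous linear recurrence relation with $k = 2$ and
coefficients $c_1(n) = c_2(n) = n - 1$, $g = 0$.
Let D(x) be the exponential generating function for $(d_n)_{n\in\mathbb{N}}$, i.e.

$$D(x) = \sum_{n=0}^{\infty} d_n \frac{x^n}{n!}.$$

Then using Observation 3.13 and the recurrence (and $d_1 = 0$) we find:

$$\begin{aligned}
D'(x) = \sum_{n=1}^{\infty} d_n \frac{x^{n-1}}{(n-1)!}
&= \sum_{n=2}^{\infty} (n-1)d_{n-1} \frac{x^{n-1}}{(n-1)!} + \sum_{n=2}^{\infty} (n-1)d_{n-2} \frac{x^{n-1}}{(n-1)!} \\
&= xD'(x) + xD(x).
\end{aligned}$$

This differential equation has (given $d_1 = 0$, $d_2 = 1$) the unique solution:

$$D(x) = \frac{e^{-x}}{1-x}.$$

Luckily, we already know the exponential generating functions for $e^{-x}$ and for
$\frac{1}{1-x}$ so we can write:

$$D(x) = e^{-x}\cdot\frac{1}{1-x} = \left(\sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n!}\right)\cdot\left(\sum_{n=0}^{\infty} n! \frac{x^n}{n!}\right) = \sum_{n=0}^{\infty}\left(\sum_{k=0}^{n} \binom{n}{k} k!\,(-1)^{n-k}\right)\frac{x^n}{n!}.$$

So we found yet another proof that $d_n = \sum_{k=0}^{n} \binom{n}{k} k!\,(-1)^{n-k}$.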
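The formula read off from the generating function can be compared with the recurrence for small n. A minimal sketch, with function names chosen here for illustration:

```python
from math import comb, factorial


def d_rec(n):
    """d_n from d_n = (n-1)(d_{n-1} + d_{n-2}) with d_0 = 1, d_1 = 0."""
    d = [1, 0]
    for m in range(2, n + 1):
        d.append((m - 1) * (d[m - 1] + d[m - 2]))
    return d[n]


def d_sum(n):
    """d_n as the coefficient sum_k C(n, k) * k! * (-1)^(n-k)."""
    return sum(comb(n, k) * factorial(k) * (-1) ** (n - k) for k in range(n + 1))


print([d_rec(n) for n in range(8)])
print(all(d_rec(n) == d_sum(n) for n in range(20)))
```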


Let us give two more examples showing how we can solve recurrences using 
generating functions. 


Example 3.41. Consider the recurrence

$$a_n = -a_{n-1} + 6a_{n-2} \quad (\text{for } n \geq 2), \quad a_0 = 1,\ a_1 = 3.$$

With the characteristic polynomial $p(x) = x^2 + x - 6$ in mind we find the
following equation for the generating function $F(x) = \sum_{n=0}^{\infty} a_n x^n$ of $(a_n)_{n\in\mathbb{N}}$.

$$\begin{aligned}
F(x)(1 + x - 6x^2) &= \sum_{n=0}^{\infty} a_n x^n + \sum_{n=1}^{\infty} a_{n-1} x^n - 6\sum_{n=2}^{\infty} a_{n-2} x^n \\
&= a_0 x^0 + a_1 x^1 + a_0 x^1 = 1 + 4x.
\end{aligned}$$

Thanks to our approach and using the recurrence for $n \geq 2$, all but a finite
number of terms canceled out. In the last step we used the initial values.

Using partial fraction decomposition (not handled here) we can simplify the
identity and obtain:

$$\begin{aligned}
F(x) = \frac{1+4x}{1+x-6x^2} = \frac{1+4x}{(1-2x)(1+3x)} &= \frac{6}{5}\cdot\frac{1}{1-2x} - \frac{1}{5}\cdot\frac{1}{1+3x} \\
&= \frac{6}{5}\sum_{n=0}^{\infty}(2x)^n - \frac{1}{5}\sum_{n=0}^{\infty}(-3x)^n
\end{aligned}$$

where in the last step we applied $\frac{1}{1-y} = \sum_{n=0}^{\infty} y^n$ for $y = 2x$ and $y = -3x$. We
can now see the coefficients:

$$a_n = \frac{6}{5}\cdot 2^n - \frac{1}{5}\cdot(-3)^n.$$
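The coefficients obtained from the partial fractions can be checked exactly with rational arithmetic. The sketch below uses the standard library type `fractions.Fraction`; the function names are illustrative.

```python
from fractions import Fraction


def a_rec(n):
    """a_n from a_n = -a_{n-1} + 6 a_{n-2} with a_0 = 1, a_1 = 3."""
    seq = [1, 3]
    while len(seq) <= n:
        seq.append(-seq[-1] + 6 * seq[-2])
    return seq[n]


def a_closed(n):
    """Coefficient of x^n in (6/5)/(1 - 2x) - (1/5)/(1 + 3x)."""
    return Fraction(6, 5) * 2 ** n - Fraction(1, 5) * (-3) ** n


print([a_rec(n) for n in range(8)])
print(all(a_rec(n) == a_closed(n) for n in range(30)))
```

Although the two summands are fractions, their sum is always an integer, as it must be.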
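Example 3.42. Consider the following non-homogeneous recurrence:

$$b_n = b_{n-1} + 2b_{n-2} + 2^n \quad (\text{for } n \geq 2), \quad b_0 = 2,\ b_1 = 1.$$

Multiplying the equivalent statement $b_n - b_{n-1} - 2b_{n-2} = 2^n$ with $x^n$ for each
$n \geq 2$ and adding yields a sum in which we can rewrite the terms:

$$\underbrace{\sum_{n=2}^{\infty} b_n x^n}_{F(x) - b_0 - x b_1} - \underbrace{\sum_{n=2}^{\infty} b_{n-1} x^n}_{x\cdot(F(x) - b_0)} - 2\underbrace{\sum_{n=2}^{\infty} b_{n-2} x^n}_{x^2 F(x)} = \underbrace{\sum_{n=2}^{\infty} 2^n x^n}_{\frac{1}{1-2x} - 1 - 2x}.$$

Solving for F(x) yields:

$$F(x)(1 - x - 2x^2) = \frac{1}{1-2x} + 1 - 3x$$

$$\Rightarrow\ F(x) = \frac{2 - 5x + 6x^2}{(1-2x)(1 - x - 2x^2)}.$$

From here, apply partial fraction decomposition with a method of your choice,
to get

$$F(x) = -\frac{1}{9}\cdot\frac{1}{1-2x} + \frac{2}{3}\cdot\frac{1}{(1-2x)^2} + \frac{13}{9}\cdot\frac{1}{1+x}.$$

From this we can obtain the coefficients again using identities we know.

$$\begin{aligned}
F(x) &= -\frac{1}{9}\sum_{n=0}^{\infty} 2^n x^n + \frac{2}{3}\left(\sum_{n=0}^{\infty} 2^n x^n\right)^2 + \frac{13}{9}\sum_{n=0}^{\infty} (-1)^n x^n \\
&= -\frac{1}{9}\sum_{n=0}^{\infty} 2^n x^n + \frac{2}{3}\sum_{n=0}^{\infty} (n+1)2^n x^n + \frac{13}{9}\sum_{n=0}^{\infty} (-1)^n x^n.
\end{aligned}$$

Reading off the coefficient of $x^n$ gives $b_n = -\frac{1}{9}2^n + \frac{2}{3}(n+1)2^n + \frac{13}{9}(-1)^n$.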
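The three series can be combined into a closed form and tested against the recurrence. Note one assumption: the sketch takes $b_1 = 1$, the value that is consistent with the numerator $2 - 5x + 6x^2$ of F(x); the function names are illustrative.

```python
from fractions import Fraction


def b_rec(n):
    """b_n from b_n = b_{n-1} + 2 b_{n-2} + 2^n with b_0 = 2 and (assumed) b_1 = 1."""
    seq = [2, 1]
    for m in range(2, n + 1):
        seq.append(seq[m - 1] + 2 * seq[m - 2] + 2 ** m)
    return seq[n]


def b_closed(n):
    """Coefficient of x^n in -(1/9)/(1-2x) + (2/3)/(1-2x)^2 + (13/9)/(1+x)."""
    return (-Fraction(1, 9) * 2 ** n
            + Fraction(2, 3) * (n + 1) * 2 ** n
            + Fraction(13, 9) * (-1) ** n)


print([b_rec(n) for n in range(8)])
print(all(b_rec(n) == b_closed(n) for n in range(30)))
```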
4 Partitions


4.1 Partitioning [n] — the set on n elements 


A partition of [n] is given by its parts $A_1,\dots,A_k$ with

$$A_i \neq \emptyset \ (\text{for } i = 1,\dots,k), \quad A_i \cap A_j = \emptyset \ (\text{for } i \neq j), \quad \bigcup_{i=1}^{k} A_i = [n].$$


Figure 24: A partition of [9] into 4 sets.


The parts are unlabeled, i.e. if two partitions use the same parts $A_i$ only
in a different order, we consider them to be identical. If we fix the number of
parts, i.e. if we want to count partitions of [n] into exactly k non-empty parts,
then their number is given by $s_k^{II}(n)$ as we have already seen in Section 1.5.2
where we considered arrangements of n labeled balls in k unlabeled boxes with
at least one ball per box.

We define the Bell Number as

$$B_n = \sum_{k=0}^{n} s_k^{II}(n).$$

It counts the total number of partitions of [n] (into an arbitrary number of sets).
Don’t get confused over the special case of n = 0: There is exactly one partition
of $\emptyset$ into non-empty parts: $\emptyset = \bigcup_{A \in \emptyset} A$. Every $A \in \emptyset$ is non-empty, since no
such A exists. So we also have $B_0 = s_0^{II}(0) = 1$.


A different way to define the Bell numbers is to consider a square free number
$k \in \mathbb{N}$, meaning

$$k = p_1 \cdot p_2 \cdot \ldots \cdot p_n \quad \text{for distinct primes } p_1,\dots,p_n.$$

The number of ways to write k as a product of integers bigger than one is exactly
$B_n$. Take for instance $k = 2\cdot 3\cdot 5\cdot 7$, then some ways of writing k would be:

$$k = 6\cdot 35, \quad k = 2\cdot 5\cdot 21, \quad k = 210$$

which directly corresponds to the partitions:

$$\{2,3,5,7\} = \{2,3\}\cup\{5,7\}, \quad \{2,3,5,7\} = \{2\}\cup\{5\}\cup\{3,7\}, \quad \{2,3,5,7\} = \{2,3,5,7\}.$$


In the following we try to find an explicit formula for B,. We start by finding 
a recursion: 
Theorem 4.1. For $n \geq 1$ we have

$$B_n = \sum_{k=0}^{n-1} \binom{n-1}{k} B_k.$$


Proof. Every partition of [n] has one part that contains the number n. In
addition to n this part contains k other numbers (for some $0 \leq k \leq n-1$). The
remaining $n - 1 - k$ elements are partitioned arbitrarily. From this correspondence
we obtain the desired identity:

$$B_n = \sum_{k=0}^{n-1} \binom{n-1}{k} B_{n-1-k} = \sum_{k=0}^{n-1} \binom{n-1}{k} B_k.$$


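Theorem 4.2. $B_n = \frac{1}{e}\sum_{k=0}^{\infty} \frac{k^n}{k!}$.


Proof. Consider the exponential generating function B(x) of $(B_n)_{n\in\mathbb{N}}$.

$$B(x) = \sum_{n=0}^{\infty} B_n \frac{x^n}{n!}.$$

Using the recursion from Theorem 4.1 we see

$$B'(x) = \sum_{n=0}^{\infty} B_{n+1} \frac{x^n}{n!} = \sum_{n=0}^{\infty} \left(\sum_{k=0}^{n} \binom{n}{k} B_k\right) \frac{x^n}{n!} = e^x B(x)$$

$$\Rightarrow\ B(x) = e^{e^x - 1} \quad (\text{using } B_0 = B(0) = 1).$$

Expanding this function gives

$$B(x) = \frac{1}{e}\, e^{e^x} = \frac{1}{e}\sum_{k=0}^{\infty} \frac{e^{kx}}{k!} = \frac{1}{e}\sum_{k=0}^{\infty} \frac{1}{k!}\sum_{n=0}^{\infty} \frac{k^n x^n}{n!} = \frac{1}{e}\sum_{n=0}^{\infty}\left(\sum_{k=0}^{\infty} \frac{k^n}{k!}\right)\frac{x^n}{n!},$$

and comparing the coefficients of $\frac{x^n}{n!}$ proves the claim.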

4.1.1 Non-Crossing Partitions 


Imagine the elements of [n] to be laid out in a circular way. 

Two disjoint sets $A, B \subseteq [n]$ are crossing if there are numbers $i < j < k < l \in [n]$
such that $\{i,k\} \subseteq A$, $\{j,l\} \subseteq B$.

A non-crossing partition is a partition in which the parts are pairwise non- 
crossing. In cyclic drawings the notion is very intuitive. See Figure 25 for 
examples. 

We denote the number of non-crossing partitions of [n] by $NC_n$. We prove
now that $NC_n$ is equal to $C_n$, the n-th Catalan number. We already came
across these numbers in Example 3.5, where we counted well formed parenthesis
expressions.

Figure 25: A non-crossing partition of [9] on the left and a crossing partition of
[9] on the right.


Theorem 4.3.

$$NC_n = C_n = \frac{1}{n+1}\binom{2n}{n}.$$


Proof. Recall the values $C_0 = 1$, $C_1 = 1$, $C_2 = 2$ (corresponding to the
parenthesis expressions “”, “()”, “()()”, “(())”) and the recursion

$$C_{n+1} = \sum_{k=0}^{n} C_k C_{n-k}.$$

It suffices to prove that the sequence $(NC_n)_{n\in\mathbb{N}}$ has the same starting values
and satisfies the same recursion.
It is easy to check that $NC_0 = 1$, $NC_1 = 1$, $NC_2 = 2$. Actually those numbers
are just the Bell numbers since partitions of [n] can only be crossing for $n \geq 4$.
We now have to prove the recursion

$$NC_{n+1} = \sum_{k=0}^{n} NC_k \cdot NC_{n-k}.$$

To this end, consider any non-crossing partition P of [n + 1]. The last element
n + 1 is in some part $S \subseteq [n+1]$. Let k be the biggest number in S other than
n + 1 if such an element exists and k = 0 if $S = \{n+1\}$. Now observe that in
the partition P, every part contains either only numbers that are bigger than k
or only numbers that are at most k: Otherwise, such a part would cross S.
This means that P decomposes into a non-crossing partition of [k] and a
non-crossing partition of $\{k+1,\dots,n\}$. Here we ignored n + 1: It must be in
the same part as k and will never produce a crossing if there has not already
been one. Such a decomposition is unique: Every non-crossing partition of
[n + 1] uniquely decomposes and every pair of non-crossing partitions of [k] and
$\{k+1,\dots,n\}$ corresponds to a non-crossing partition of [n + 1].
This proves the claimed recursion and therefore the Theorem.
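For small n the theorem can be confirmed by brute force: enumerate all partitions of [n], discard those with two crossing parts, and compare the count with the Catalan number. The following is only a sketch for illustration (exponential running time, helper names invented here).

```python
from itertools import combinations
from math import comb


def set_partitions(elements):
    """Yield all partitions of the list `elements` as lists of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def crossing(A, B):
    """True if i < j < k < l exist with i, k in one block and j, l in the other."""
    for X, Y in ((A, B), (B, A)):
        for i, k in combinations(sorted(X), 2):
            between = any(i < j < k for j in Y)
            beyond = any(l > k for l in Y)
            if between and beyond:
                return True
    return False


def count_noncrossing(n):
    """Number of partitions of {1, ..., n} without a pair of crossing blocks."""
    total = 0
    for partition in set_partitions(list(range(1, n + 1))):
        if not any(crossing(A, B) for A, B in combinations(partition, 2)):
            total += 1
    return total


def catalan(n):
    return comb(2 * n, n) // (n + 1)


print([count_noncrossing(n) for n in range(8)])
print([catalan(n) for n in range(8)])
```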


4.2 Partitioning n — the natural number 


We are interested in ways to write n as the sum of positive natural numbers.
We would say, for instance, that

$$n = 17 = 5 + 5 + 4 + 3$$

is a partition of n = 17 into the (unlabeled) parts 5, 5, 4 and 3.

Alternatively we write $\lambda = (5,5,4,3)$ and say $\lambda$ is the partition of n
(even though $\lambda$ is a sorted tuple) hoping this will not be confusing.

We already considered this in Section 1.5.4 where we counted arrangements
of n unlabeled balls in k unlabeled boxes with at least one ball per box. The
number of such arrangements is given by $p_k(n)$. We established a recursive
formula

$$p_k(n) = \begin{cases}
0 & \text{if } k > n, \\
0 & \text{if } n \geq 1,\ k = 0, \\
1 & \text{if } n = k = 0, \\
p_k(n-k) + p_{k-1}(n-1) & \text{if } 1 \leq k \leq n.
\end{cases}$$

We define the total number of partitions of n (into an arbitrary number of parts)
as the partition function

$$p(n) = \sum_{k=0}^{n} p_k(n).$$
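The case distinction translates directly into a memoized function. A minimal sketch; the argument order `p_k(n, k)` and the function names are choices made here.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def p_k(n, k):
    """Number of partitions of n into exactly k positive parts."""
    if k > n:
        return 0
    if k == 0:
        return 1 if n == 0 else 0
    # Either every part is at least 2 (subtract 1 from each of the k parts),
    # or some part equals 1 (remove it).
    return p_k(n - k, k) + p_k(n - 1, k - 1)


def p(n):
    """The partition function: partitions of n into any number of parts."""
    return sum(p_k(n, k) for k in range(n + 1))


print([p(n) for n in range(11)])
print(p(17))
```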


We illustrate a partition by its Ferrer diagram (also called Young diagram).
It consists of rows of squares (left aligned) corresponding to the parts (largest
part in the topmost row). Consider for instance the partition $\lambda = (5,5,4,3)$ of
n = 17 which has the Ferrer diagram

□□□□□
□□□□□
□□□□
□□□


Observation 4.4. The number of partitions of n into at most k parts is equal 
to the number of partitions of n with parts of size at most k. 


Proof. The Ferrer diagrams for partitions with at most k parts are those with 
at most k rows. The Ferrer diagrams for partitions with parts of size at most 
k are those with at most k columns. Clearly there is a bijection between those 
sets of diagrams: Flip them along the diagonal! 


For instance, flipping maps the diagram for the partition $\lambda = (7,5,3,3,3,2)$ to the
diagram for the partition $\lambda^* = (6,6,5,2,2,1,1)$.


Formal Proof. The bijection on the diagrams corresponds to a bijection on the
partitions that maps the partition $(\lambda_1,\dots,\lambda_k)$ of n with biggest part l to the
conjugate partition $(\lambda^*_1,\dots,\lambda^*_l)$ of n defined as

$$\lambda^*_j = \#\{i \mid \lambda_i \geq j\}.$$
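The conjugate partition can be computed straight from its definition, by counting for each j the parts of size at least j. A short sketch, with the function name `conjugate` chosen here:

```python
def conjugate(partition):
    """Conjugate of a partition given as a list of positive integers."""
    if not partition:
        return []
    return [sum(1 for part in partition if part >= j)
            for j in range(1, max(partition) + 1)]


lam = [7, 5, 3, 3, 3, 2]
print(conjugate(lam))
print(conjugate(conjugate(lam)) == lam)
```

Applying the map twice returns the original partition, which reflects that flipping a diagram along the diagonal twice changes nothing.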


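Theorem 4.5 (Euler).

$$\sum_{n=0}^{\infty} p(n)x^n = \prod_{k=1}^{\infty} \frac{1}{1-x^k}.$$


Proof. We boldly rewrite the infinite product as an infinite product of infinite
sums using the identity $\frac{1}{1-y} = \sum_{n=0}^{\infty} y^n$:

$$\prod_{k=1}^{\infty} \frac{1}{1-x^k} = \frac{1}{1-x}\cdot\frac{1}{1-x^2}\cdot\frac{1}{1-x^3}\cdots = \left(\sum_{n_1=0}^{\infty} x^{n_1}\right)\left(\sum_{n_2=0}^{\infty} x^{2n_2}\right)\left(\sum_{n_3=0}^{\infty} x^{3n_3}\right)\cdots$$

In an expansion of this, the coefficient in front of $x^n$ is just the number of ways
we can choose $n_1, n_2, n_3, \dots$ such that $\sum_{i\geq 1} i\cdot n_i = n$. But such choices exactly
correspond to partitions of n, i.e. the coefficient is p(n).

Take for instance our favorite partition n = 17 = 5 + 5 + 4 + 3. It corresponds
to choosing $n_5 = 2$, $n_4 = 1$, $n_3 = 1$ and all other indices $n_i$ as 0. This proves
the claim.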
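The product can be expanded on a computer if everything is truncated at a fixed degree. In the sketch below, multiplying by $\frac{1}{1-x^k}$ is done in place on the coefficient list; the function name and the parameter `limit` (the highest exponent kept) are choices made for this illustration.

```python
def partition_counts_from_product(limit):
    """Coefficients of x^0..x^limit in prod_{k>=1} 1/(1 - x^k)."""
    coeffs = [1] + [0] * limit  # the constant power series 1
    for k in range(1, limit + 1):
        # Multiply by 1/(1 - x^k) = 1 + x^k + x^(2k) + ...:
        # the new series c' satisfies c'[n] = c[n] + c'[n - k].
        for n in range(k, limit + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs


print(partition_counts_from_product(17))
```

Factors with k larger than `limit` cannot influence the kept coefficients, which is why the finite loop suffices.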


Define $p_{odd}(n)$ as the number of partitions of n into odd parts and $p_{dist}(n)$
as the number of partitions of n into distinct parts.
Take for instance n = 7. We have $p_{odd}(7) = 5$ since

7 = 1+1+1+1+1+1+1, 7 = 1+1+1+1+3, 7 = 1+3+3, 7 = 1+1+5, 7 = 7,

and $p_{dist}(7) = 5$ since

7 = 1+2+4, 7 = 3+4, 7 = 2+5, 7 = 1+6, 7 = 7.


This is no coincidence, in fact, in the problem class you already saw:

$$\sum_{n=0}^{\infty} p_{odd}(n)x^n = \prod_{k=0}^{\infty} \frac{1}{1-x^{2k+1}} = \prod_{k=1}^{\infty} (1 + x^k) = \sum_{n=0}^{\infty} p_{dist}(n)x^n.$$

This proves $p_{odd} = p_{dist}$, but we will prove it yet again, this time by constructing
a bijection.


Theorem 4.6. $p_{odd} = p_{dist}$.


Proof. Let $n = \lambda_1 + \lambda_2 + \ldots + \lambda_k$ be a partition of $n$ into distinct parts. We separate the powers of 2 from the $\lambda_i$, i.e. we write:

$$n = u_1 2^{a_1} + u_2 2^{a_2} + \ldots + u_k 2^{a_k}$$

for odd numbers $u_i$ and $\lambda_i = u_i 2^{a_i}$. Note that the $u_i$ need no longer be distinct, for example if $\lambda_1 = 5, \lambda_2 = 10$ we would have $u_1 = u_2 = 5$. We sort the summands according to the $u_i$ which gives

$$n = \mu_1 \underbrace{(2^{\alpha_1} + 2^{\alpha_2} + \ldots + 2^{\alpha_{l_1}})}_{r_1} + \mu_2 \underbrace{(2^{\beta_1} + 2^{\beta_2} + \ldots + 2^{\beta_{l_2}})}_{r_2} + \ldots \qquad (\star)$$

where the values $\mu_i$ are all distinct and take the roles of the $u_i$, in particular, the following multisets coincide

$$\{u_1, u_2, \ldots, u_k\} = \{\underbrace{\mu_1, \ldots, \mu_1}_{l_1 \text{ times}}, \underbrace{\mu_2, \ldots, \mu_2}_{l_2 \text{ times}}, \ldots\}.$$

Note that the values $r_i$ are sums of distinct powers of 2, since parts $\lambda_i$ with the same odd factor are distinct and hence carry distinct powers of 2.
For the bijection, we map the original partition into distinct parts to the following partition into odd parts

$$n = \underbrace{\mu_1 + \mu_1 + \ldots + \mu_1}_{r_1 \text{ times}} + \underbrace{\mu_2 + \mu_2 + \ldots + \mu_2}_{r_2 \text{ times}} + \ldots$$

We still need to argue that this constitutes a bijection. To do so, we explain the inverse mapping.

Given a partition into odd numbers with repetition numbers $r_i$, there is a unique way to write these $r_i$ as sums of distinct powers of 2, the binary expansion of $r_i$. This gets us to the situation $(\star)$ and from there we get back to a partition into distinct parts by multiplying out.
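The two directions of this bijection are short enough to write down as code. The following is a sketch under our own naming (`distinct_to_odd` and `odd_to_distinct` are not names from the lecture); partitions are represented as lists of parts.

```python
from collections import Counter


def distinct_to_odd(parts):
    """Map a partition into distinct parts to a partition into odd parts."""
    odd_parts = []
    for part in parts:
        power = 1
        while part % 2 == 0:  # split part = u * 2^a with u odd
            part //= 2
            power *= 2
        odd_parts.extend([part] * power)  # u is repeated 2^a times
    return sorted(odd_parts, reverse=True)


def odd_to_distinct(parts):
    """Inverse map: read each repetition number in binary."""
    distinct = []
    for odd, count in Counter(parts).items():
        bit = 1
        while bit <= count:
            if count & bit:  # this power of 2 occurs in the binary expansion
                distinct.append(odd * bit)
            bit *= 2
    return sorted(distinct, reverse=True)


if __name__ == "__main__":
    image = distinct_to_odd([12, 6, 4, 3, 1])
    print(image)
    print(odd_to_distinct(image))
```

Applying one function after the other returns the partition we started with, which is the computational counterpart of the uniqueness of binary expansions used in the proof.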


Example. To illustrate the last Theorem, take this partition into distinct parts:

26 = 12 + 6 + 4 + 3 + 1.

We separate the powers of two and sort the terms according to the odd numbers that remain:

$$26 = 3 \cdot 2^2 + 3 \cdot 2^1 + 1 \cdot 2^2 + 3 \cdot 2^0 + 1 \cdot 2^0 = 3(2^2 + 2^1 + 2^0) + 1(2^2 + 2^0) = 3 \cdot 7 + 1 \cdot 5$$

which yields the partition

26 = 3+3+3+3+3+3+3+1+1+1+1+1.

All steps are reversible, as there is only one way to write 5 and 7 as sums of distinct powers of two. In terms of Ferrer diagrams we have mapped:

[Figure: Ferrer diagrams of the two partitions of 26 above]


Now define $p_d^{even}(n)$ and $p_d^{odd}(n)$ as the numbers of partitions of $n$ into an even number of distinct parts and an odd number of distinct parts, respectively. We have, for instance

$p_d^{even}(7) = \#\{$"1+6", "2+5", "3+4"$\} = 3$,
$p_d^{odd}(7) = \#\{$"1+2+4", "7"$\} = 2$.

Apparently, these numbers can differ, but we now show that they can differ by at most 1, and characterize when this is the case.


To this end, define for $k \in \mathbb{Z}$ the pentagonal number $\omega_k = \frac{(3k-1)k}{2}$. Note that with this definition $\omega_{-k} = \frac{(3k+1)k}{2}$. Some values are:

k          -3   -2   -1    0    1    2    3
$\omega_k$   15    7    2    0    1    5   12
Lemma 4.7.

$$p_d^{even}(n) - p_d^{odd}(n) = \begin{cases} (-1)^k & \text{if } n = \omega_k \text{ for some } k \in \mathbb{Z}, \\ 0 & \text{otherwise.} \end{cases}$$


Proof. Consider the Ferrer diagram of a partition into distinct parts. In it, no two rows have the same length. Define the slope S of such a diagram to be a maximal staircase going diagonally down starting from the top-right square and define the bottom B as the last row of the diagram. The following diagram has a slope of length 3 (highlighted in red) and a bottom of size 2 (highlighted in blue):

[Figure: Ferrer diagram with its slope (red) and its bottom (blue)]

Note that, in a few special diagrams, B and S may have a single square in common. We define the set A := B ∩ S; it contains the single common square if it exists, and is empty otherwise. We will be sloppy in notation, using B, S and A to simultaneously denote the set and the size of the set.

Now distinguish three types of partitions corresponding to three types of diagrams where:

• Type 1: S ≥ B + A

• Type 2: S < B − A

• Type 3: B − A ≤ S < B + A

Note that the latter case can only occur for A = 1 and S ∈ {B, B − 1}.

Claim. (i) Type 3 partitions can only occur if $n = \omega_k$ for some $k \in \mathbb{Z}$.

(ii) Conversely, if $n = \omega_k$ for some $k \in \mathbb{Z}$, then there is exactly one Type 3 partition of $n$.


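Proof of (i). Consider the case $k := S = B - 1$ first, where the diagram looks like this (for $k = 4$):

[Figure: Type 3 diagram with k = 4 rows of lengths k+1, ..., 2k]

which gives

$$n = k^2 + \frac{(k+1)k}{2} = \frac{(3k+1)k}{2} = \omega_{-k}.$$

The other case is $k := S = B$, meaning the diagram looks like this (for $k = 4$):

[Figure: Type 3 diagram with k = 4 rows of lengths k, ..., 2k-1]

which gives

$$n = \omega_{-k} - k = \frac{(3k+1)k}{2} - \frac{2k}{2} = \frac{(3k-1)k}{2} = \omega_k.$$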


Proof of (ii). The existence is already clear from our proof of (i): We found for arbitrary $k \in \mathbb{N}$ a Type 3 partition of $\omega_{-k}$ and of $\omega_k$. For the uniqueness, note that no two diagrams of Type 3 have the same size: We can iterate through all of them by alternatingly adding a column and a row.

Next we prove the following claim:

Claim (iii). For every $k \in \mathbb{N}$ there is a bijection between Type 1 partitions on $k$ rows and Type 2 partitions on $k - 1$ rows.

Note that from this the Lemma follows, since it guarantees a bijection that maps every partition of Type 1 or 2 to a partition with one row more or one row less, so every such partition into an even number of parts is mapped to a partition into an odd number of parts and vice versa. This shows that, among the partitions of Type 1 and 2, the numbers of even and odd partitions must coincide. The only disturbance can be the single Type 3 partition that exists if $n = \omega_k$; it has $|k|$ rows, so it is even if $k$ is even and odd if $k$ is odd.


Proof of (iii). Consider a Type 1 partition. Its slope is at least as large as its bottom, maybe like this:

[Figure: a Type 1 diagram]

We take away the bottom and distribute its squares among the first |B| rows of the diagram, one square per row, like this:

[Figure: the diagram after moving the bottom onto the slope]

There is enough room to do this (since the slope was at least as big as the bottom) and the resulting partition is a partition into distinct parts. The size of the bottom has increased and the size of the new slope is the size of the old bottom. Therefore the new diagram is of Type 2 or Type 3 and, looking more closely, Type 3 can actually not occur as a result of our operation, so it really is of Type 2. The inverse operation is to take the current slope and create a new row from it. After checking that this maps Type 2 partitions to Type 1 partitions we are done.
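Lemma 4.7 is easy to test by brute force for small $n$. The sketch below is ours, not the lecture's: `distinct_partitions`, `even_minus_odd` and `lemma_prediction` are hypothetical helper names, and the enumeration is exponential, so it is only meant for small inputs.

```python
def distinct_partitions(n, largest=None):
    """Yield all partitions of n into distinct parts as decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        # All further parts must be strictly smaller than `first`.
        for rest in distinct_partitions(n - first, first - 1):
            yield (first,) + rest


def even_minus_odd(n):
    """Left hand side of Lemma 4.7, computed by enumeration."""
    return sum((-1) ** len(p) for p in distinct_partitions(n))


def lemma_prediction(n):
    """Right hand side of Lemma 4.7: (-1)^k if n is pentagonal, else 0."""
    k = 0
    while (3 * k - 1) * k // 2 <= n:
        if n in ((3 * k - 1) * k // 2, (3 * k + 1) * k // 2):
            return (-1) ** k
        k += 1
    return 0


if __name__ == "__main__":
    for n in range(16):
        print(n, even_minus_odd(n), lemma_prediction(n))
```

For $n = 7 = \omega_{-2}$ both columns show $+1$, in line with the counts $3$ and $2$ from the example above.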


Theorem 4.8 (Euler's Pentagonal Number Theorem).

$$\prod_{k=1}^{\infty} (1 - x^k) = \sum_{k=-\infty}^{\infty} (-1)^k x^{\omega_k} = 1 + \sum_{k=1}^{\infty} (-1)^k \left(x^{\omega_k} + x^{\omega_{-k}}\right).$$

Proof. Multiply out the left hand side. The coefficient of $x^n$ counts the number of partitions of $n$ into distinct parts, however, partitions into an odd number of parts are counted with negative sign. Therefore, using Lemma 4.7 in the second step,

$$\prod_{k=1}^{\infty} (1 - x^k) = \sum_{n=0}^{\infty} \left(p_d^{even}(n) - p_d^{odd}(n)\right) x^n = \sum_{k=-\infty}^{\infty} (-1)^k x^{\omega_k}.$$
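As a sanity check of Theorem 4.8 one can multiply out the product up to a fixed degree. This is a small sketch with a function name of our choosing (`euler_product`); factors $(1 - x^k)$ with $k$ above the chosen degree cannot change the computed coefficients, so truncation is harmless.

```python
def euler_product(limit):
    """Coefficients of prod_{k>=1} (1 - x^k) up to degree `limit`."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1):
        # Multiply by (1 - x^k). Going downwards in n makes sure that
        # coeffs[n - k] still holds the value from before this factor.
        for n in range(limit, k - 1, -1):
            coeffs[n] -= coeffs[n - k]
    return coeffs


if __name__ == "__main__":
    for degree, c in enumerate(euler_product(15)):
        if c != 0:
            print(degree, c)
```

The only non-zero coefficients appear at the degrees 0, 1, 2, 5, 7, 12, 15, that is at the pentagonal numbers $\omega_k$ for $k = 0, \pm 1, \pm 2, \pm 3$.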


We have yet to explain why the $\omega_k$ are called pentagonal numbers. It is basically analogous to square numbers and triangle numbers: All occur naturally when filling a two-dimensional region with dots.

[Figure: dot patterns for triangle, square and pentagonal numbers]

We start with one dot, and say it is "layer 0", then add layer by layer filling an area of the respective kind. For pentagonal numbers, there are four dots in layer 1, seven dots in layer 2 and so on. Generally, in layer $i$ there are $3i + 1$ dots since there are three sides with $i + 1$ dots each, but two dots are shared by two sides. In a drawing with $k$ layers there are:

$$\sum_{i=0}^{k-1} (3i + 1) = k + 3 \sum_{i=0}^{k-1} i = k + 3\binom{k}{2} = \frac{(3k-1)k}{2} = \omega_k.$$


4.3. Young Tableau 


A Young tableau $T$ is a Ferrer diagram of a partition $\lambda$ of $n$ that is filled with all numbers $1, \ldots, n$. For instance if $\lambda = (7,6,5,3,2,2,1)$ is a partition of 26:

[Figure: a Young tableau of shape (7,6,5,3,2,2,1) filled with the numbers 1, ..., 26]

We say $T$ is a tableau of $\lambda$ or $T$ has shape $\lambda$. There are $n!$ tableaux of shape $\lambda$.


Definition 4.9. A standard Young tableau is a Young tableau where the numbers in the rows are increasing from left to right and the numbers in the columns are increasing from top to bottom. Take for instance the following standard Young tableau:

[Figure: a standard Young tableau whose first three rows read 1 2 3 6 12 16 25 / 4 7 8 17 18 21 / 5 11 13 19 26]

From now on we only consider standard Young tableaux.


Theorem 4.10 (Robinson-Schensted-Correspondence). There is a bijection between the permutations of $[n]$ and the set

$$\bigcup_{\lambda \text{ is partition of } n} \{(T_1, T_2) \mid T_1 \text{ and } T_2 \text{ are standard Young tableaux of shape } \lambda\},$$

i.e. the set of ordered pairs $(T_1, T_2)$ where $T_1$ and $T_2$ are standard Young tableaux of the same shape.


Proof. We will see a geometric construction that builds, given a permutation $\pi$, a corresponding pair of standard Young tableaux.

Let $\pi$ be a permutation of $[n]$ and $X(\pi) = \{(i, \pi(i)) \mid i = 1, \ldots, n\}$ the corresponding point set in the plane. For instance, for the permutation $\pi = 3271546$ of $[7]$ the point set $X(\pi)$ is:

[Figure: the points (1,3), (2,2), (3,7), (4,1), (5,5), (6,4), (7,6) in a grid with axes x and $\pi(x)$]


A point $p$ is minimal if there is no other point that is both to the left and below of $p$. The set of all minimal points in $X$ is therefore

$$\min(X) = \{(x,y) \in X \mid \forall (x',y') \in X \setminus \{(x,y)\}\colon x' > x \text{ or } y' > y\}.$$

The shadowline $S(X)$ for a point set $X$ is the weakly decreasing rectilinear line through all points in $\min(X)$ with convex bends exactly in $\min(X)$. Here, by convex we mean a bend of the form └ and by concave we mean a bend of the form ┐. In the following drawing, $\min(X)$ (red) and $S(X)$ (black) are shown.

[Figure: the point set $X(\pi)$ with $\min(X)$ in red and the shadowline $S(X)$ in black]


In case you like pseudo-code, a concise algorithmic description of our con- 
struction is given below. It does not yet provide the pair of tableaux we want, 
but all the auxiliary objects we require to define them. A detailed description 
in words follows. 


Algorithm 1 Geometric Variant of Robinson-Schensted-Correspondence
  X_1 ← X(π)
  i ← 0
  while X_{i+1} ≠ ∅ do
      i ← i + 1
      j ← 0
      while X_i ≠ ∅ do
          j ← j + 1
          S_i^j ← S(X_i)
          X_i ← X_i \ min(X_i)
      end while
      X_{i+1} ← concave bends of S_i^1, ..., S_i^j
  end while
  m ← i

The algorithm proceeds in phases (counted by the variable $i$); the total number of phases $m$ is not known beforehand (but bounded by $n$). We start every phase $i$ with a non-empty point set $X_i$. For $X_i$ we construct a sequence of shadowlines $S_i^1, \ldots, S_i^{n_i}$ where the $j$-th shadowline is taken for the point set consisting of those points from $X_i$ that were not used for previous shadowlines:

$$S_i^1 = S(X_i), \quad S_i^2 = S(X_i \setminus S_i^1). \quad \text{In general: } S_i^j = S\Big(X_i \setminus \bigcup_{k=1}^{j-1} S_i^k\Big).$$

The $i$-th phase ends as soon as all points from $X_i$ were contained in one of the shadowlines $S_i^1, S_i^2, \ldots, S_i^{n_i}$.

The shadowlines of phase $i$ determine the $i$-th row of the tableaux $T_1$ and $T_2$ we want to construct. Let $x_i^j$ be the x-coordinate of the first segment of the shadowline $S_i^j$, i.e. the x-coordinate at which $S_i^j$ leaves the picture on the top, and $y_i^j$ the y-coordinate of the last segment of $S_i^j$, i.e. the y-coordinate at which $S_i^j$ leaves the picture on the right. Then the $i$-th row of $T_1$ and $T_2$ consists of the numbers $x_i^1, \ldots, x_i^{n_i}$ and $y_i^1, \ldots, y_i^{n_i}$, respectively.
Shadowlines may contain concave bends (they do iff they have two or more convex bends). The set $X_{i+1}$ is defined as the set of all concave bends occurring in the shadowlines $S_i^1, \ldots, S_i^{n_i}$. If $X_{i+1}$ is empty, we are done, otherwise, it serves as point set for phase $i + 1$.
(proof continues later)
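For readers who prefer running code over pseudo-code, here is one possible implementation of the construction described above. It is a sketch under our own conventions: the names `shadow_phase` and `robinson_schensted` are made up for this illustration, points are `(x, y)` tuples, and tableaux are returned as lists of rows.

```python
def shadow_phase(points):
    """Run one phase: peel shadowlines off `points` until no point is left.

    Returns the x-coordinates at which the shadowlines leave at the top,
    the y-coordinates at which they leave on the right, and all concave bends.
    """
    remaining = sorted(points)  # sorted by x-coordinate
    xs, ys, bends = [], [], []
    while remaining:
        line, rest = [], []
        for x, y in remaining:
            # Scanning from left to right, a point is minimal iff it lies
            # below every minimal point found so far.
            if not line or y < line[-1][1]:
                line.append((x, y))
            else:
                rest.append((x, y))
        xs.append(line[0][0])
        ys.append(line[-1][1])
        # Between consecutive convex bends (x1, y1), (x2, y2) the line has
        # a concave bend at (x2, y1).
        bends.extend((b[0], a[1]) for a, b in zip(line, line[1:]))
        remaining = rest
    return xs, ys, bends


def robinson_schensted(perm):
    """Return (T1, T2) for a permutation given in one-line notation."""
    points = [(i + 1, value) for i, value in enumerate(perm)]
    t1, t2 = [], []
    while points:
        xs, ys, points = shadow_phase(points)
        t1.append(xs)
        t2.append(ys)
    return t1, t2


if __name__ == "__main__":
    print(robinson_schensted([3, 2, 7, 1, 5, 4, 6]))
```

Each pass of the outer `while` loop corresponds to one phase $i$; the list `points` plays the role of $X_i$ and is replaced by the concave bends, i.e. by $X_{i+1}$, at the end of the phase.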


Before we verify that the construction yields the bijection we desire, we give an example.

Example. Consider again $\pi = 3271546$. The first phase of the algorithm will find three shadowlines $S_1^1, S_1^2, S_1^3$. They leave the diagram on x positions 1, 3 and 7 and on y positions 1, 4 and 6, giving rise to the first rows 1 3 7 of $T_1$ and 1 4 6 of $T_2$.

[Figure: the three shadowlines of phase 1 and the partial tableaux]

There are four concave bends on these shadowlines which give the point set for the next phase:

$$X_2 = \{(2,3), (4,2), (5,7), (6,5)\}.$$

In phase 2 we get two shadowlines and add corresponding second rows to the tableaux; the lines leave the diagram on x positions 2 and 5 and on y positions 2 and 5.

[Figure: the two shadowlines of phase 2 and the partial tableaux]

There are still concave bends, so we proceed with phase 3.

[Figure: the shadowlines of phase 3]

No concave bends were made this time, our construction is done. The pair $(T_1, T_2)$ is the result:

T_1 =  1 3 7        T_2 =  1 4 6
       2 5                 2 5
       4 6                 3 7


We now establish, in a series of claims, that the construction constitutes a bijection between permutations and pairs of Young tableaux of the same shape as desired. We use the notion of a chain, which is a subset $Y$ of a point set $X$ such that $Y$ is increasing, i.e. $Y = \{(x_1, y_1), (x_2, y_2), \ldots, (x_k, y_k)\}$ with $x_1 < x_2 < \ldots < x_k$ and $y_1 < y_2 < \ldots < y_k$.

Claim. The number $n_i$ of shadowlines in phase $i$ is the length of the longest chain in $X_i$.

Proof of Claim: Let $Y$ be a longest chain in $X_i$. No shadowline can contain more than one point of $Y$, since shadowlines are decreasing while $Y$ is increasing. This shows $n_i \ge |Y|$.

Now observe that for the element $\mu \in Y$ with smallest x- and y-coordinate we have $\mu \in \min(X_i)$, otherwise there would be a point to the left and below of $\mu$ enabling us to extend $Y$. Therefore, the first shadowline $S_i^1$ will contain $\mu$ (and, indeed, an element of every longest chain) so the longest chain in $X_i \setminus S_i^1$ will be of size $|Y| - 1$. The next shadowline will contain another point of $Y$ and, by induction, we conclude that $n_i \le |Y|$ since the point set remaining after $|Y|$ shadowlines will only contain chains of size 0, meaning it is empty. (claim)


Claim. The tableaux have valid shape, i.e. $n_{i+1} \le n_i$ for all $i = 1, \ldots, m - 1$.

Proof of Claim: Let $Y$ be a longest chain in $X_{i+1}$, consisting of some positions of concave bends of the shadowlines from the previous phase $i$. Since the concave bends of every shadowline are decreasing, $Y$ can only contain one concave bend from each shadowline of phase $i$. This means $|Y| \le n_i$. We already know $|Y| = n_{i+1}$ from the previous claim, so we are done. (claim)


Claim. The rows of $T_1$ and $T_2$ are increasing.

Proof of Claim: This is equivalent to saying: Shadowlines constructed later in the phase will leave the picture further to the right and further to the top. This is obvious by construction: Every shadowline goes through the unused point with smallest x coordinate and the unused point with smallest y coordinate, making these points unavailable for later shadowlines. (claim)

Claim. The columns of $T_1$ and $T_2$ are increasing.

Proof of Claim: We consider two consecutive phases $i$ and $i + 1$ and show that in the step from row $i$ to row $i + 1$ every column is increasing.

In phase $i$ the shadowlines $S_i^1, \ldots, S_i^{n_i}$ were constructed with corresponding sets of concave bends $B_i^1, \ldots, B_i^{n_i}$ (some of which may be empty) that together form the set $X_{i+1}$. In the following we implicitly use that shadowlines of the same phase are non-crossing. We have $B_i^1 \subseteq \min(X_{i+1})$, meaning that the points $B_i^1$ will be "consumed" by the first shadowline $S_{i+1}^1$ of phase $i + 1$. Therefore $S_{i+1}^2$ will be constructed from a subset of $B_i^2, \ldots, B_i^{n_i}$, and will use up all remaining points from $B_i^2$ (if any remain). Generally, $S_{i+1}^j$ will be constructed from a subset of $B_i^j, \ldots, B_i^{n_i}$. The x-coordinate of the leftmost point of $S_{i+1}^j$ is therefore at least the x-coordinate of the leftmost concave bend of $S_i^j$ and therefore to the right of the leftmost point of $S_i^j$. This proves that columns of $T_1$ are increasing; for $T_2$ do the same argument for y coordinates. (claim)
So far we proved that our map is well-defined, i.e. it gives rise to a pair of standard Young tableaux for any permutation $\pi \in S_n$. The final step is to show that the map constitutes a bijection.

Claim. There is an inverse map, i.e. from any pair $(T_1, T_2)$ of standard Young tableaux of the same shape we can recover a corresponding permutation $\pi \in S_n$.


Proof. We demonstrate the inverse procedure with an example, hoping the general case will be apparent from this. Consider the following pair of Young tableaux:

T_1 =  1 2 4 8        T_2 =  1 3 6 7
       3 6                   2 4
       5                     5
       7                     8

These tableaux contain eight numbers in four rows, so we try to recover a four-phased construction and annotate the x and y coordinates of an 8 × 8 grid with the index of the phase at which we wish a corresponding shadowline to leave the picture.

coordinate                      1 2 3 4 5 6 7 8
phase leaving at the top (x)    1 1 2 1 3 2 4 1
phase leaving on the right (y)  1 2 1 2 3 1 1 4

We claim there is a unique way to draw the shadowlines of phases 4, 3, 2 and 1 (one after the other) that correspond to a forward construction (see Figure 26).

First connect the indices of the last phase with L-shaped shadowlines (the lines of the last phase do not have concave bends). In our example, there is only one such line (drawn in red). Now draw the shadowlines for phase 3, again there is only one in our example. It must connect the corresponding numbers at the border of the picture and have a concave bend at the convex bend of the shadowline of phase 4 on the way; there is only one way to do it. The lines for phase 2 (shown in blue) are added one after the other (right-most first). The first line must visit the convex bend at (7,5): if it didn't, then this point would not be reachable for the next blue shadowline (since they must not cross). In the last step the shadowlines of phase 1 are added (yellow). Convince yourself that there is, again, only one way to draw them: Every yellow shadowline must visit exactly those convex bends of the blue lines that were not visited by a previous yellow shadowline and that would otherwise become unreachable for subsequent yellow shadowlines.

Now the permutation can be obtained by taking the convex bends of the shadowlines of phase 1. It is: $\pi = 28561437$.

With techniques similar to what we have already seen one can show that this backwards procedure works for arbitrary tableaux.

Figure 26: Reverse construction of the Robinson-Schensted-Correspondence.


4.3.1 Counting Tableaux 


Given the geometric variant of the Robinson-Schensted-Correspondence (the original construction, although equivalent, was formulated very differently), it is not hard to count all Young tableaux with $n$ elements.

We say a permutation $\pi \in S_n$ is an involution if $\pi^2 = \mathrm{id}$, i.e. applying $\pi$ twice yields the identity permutation. This is equivalent to saying that the disjoint cycle decomposition of $\pi$ contains only cycles of size 2 (transpositions) and cycles of size 1 (fixed points). For instance:

x:        1 2 3 4 5 6 7
$\pi(x)$:  4 6 3 1 7 2 5

We would write this involution as $\pi = (1\ 4)(2\ 6)(7\ 5)$.


Lemma 4.11. Let $\pi \xrightarrow{RS} (T_1, T_2)$, i.e. $\pi$ is mapped to $(T_1, T_2)$ via the Robinson-Schensted-Correspondence. Then

(i) $\pi^{-1} \xrightarrow{RS} (T_2, T_1)$.

(ii) $\pi$ is an involution iff $T_1 = T_2$.

(iii) The length of the longest increasing subsequence in $\pi$ is the length of the first row in $T_1$.

Proof. (i) Recall that the construction starts with

$$X(\pi) = \{(i, \pi(i)) \mid i \in [n]\} = \{(\pi^{-1}(j), j) \mid j \in [n]\}.$$

The point set $X(\pi^{-1})$ is therefore obtained by flipping the point set $X(\pi)$ along the diagonal $x = y$, in other words, changing the roles of x and y. The shadowlines we obtain will also be flipped, so every x-coordinate we would have written into $T_1$ is now a y-coordinate and therefore written into $T_2$ and vice versa. This just means that the roles of $T_1$ and $T_2$ are swapped.

(ii) This follows from (i): $\pi$ is an involution iff $\pi = \pi^{-1}$ and, since the correspondence is a bijection, this is the case iff $(T_1, T_2) = (T_2, T_1)$, i.e. iff $T_1 = T_2$.

(iii) Consider an increasing subsequence, i.e. $i_1 < i_2 < \ldots < i_k$ with $\pi(i_1) < \pi(i_2) < \ldots < \pi(i_k)$. The corresponding set $Y = \{(i_j, \pi(i_j)) \mid j \in [k]\}$ is a chain in $X(\pi)$. Now we can use a sub claim from the last theorem which asserted that the length of the $i$-th row of the tableaux is the length of a longest chain in $X_i$, using it for $i = 1$.


Note that by Lemma 4.11(ii) we have that the number $i_n$ of involutions of $[n]$ and the number of standard Young tableaux with $n$ squares coincide. The following Lemma therefore gives a recurrence for the number of Young tableaux.

Lemma 4.12. For the number $i_n$ of involutions of $[n]$ we have

$$i_1 = 1, \quad i_2 = 2, \quad i_n = i_{n-1} + (n-1)\, i_{n-2}.$$


Proof. Consider involutions in their disjoint cycle decomposition. The cycles 
contain one or two elements, so think of an involution as an arrangement of n 
labeled balls in unlabeled boxes and one or two balls per box. To count them, 
we distinguish two cases: 


Case 1: The ball with label n is in its own box. The rest is an arrangement 
with n — 1 balls. 


Case 2: The ball with label n is in a box together with another ball x. There 
are n — 1 choices for x and the rest is an arrangement with n — 2 balls. 


This proves the claim. 
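The recurrence of Lemma 4.12 is easily compared against a brute-force count. The following sketch uses names of our own choosing (`count_involutions`, `count_involutions_brute`); the brute-force variant runs through all $n!$ permutations and is only feasible for small $n$.

```python
from itertools import permutations


def count_involutions(n):
    """Number of involutions of [n], n >= 1, via i_n = i_{n-1} + (n-1) * i_{n-2}."""
    if n == 1:
        return 1
    prev, cur = 1, 2  # i_1 and i_2
    for m in range(3, n + 1):
        prev, cur = cur, cur + (m - 1) * prev
    return cur


def count_involutions_brute(n):
    """Count permutations p of {0, ..., n-1} with p(p(i)) = i for all i."""
    return sum(
        all(p[p[i]] == i for i in range(n))
        for p in permutations(range(n))
    )


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, count_involutions(n), count_involutions_brute(n))
```

By the remark before Lemma 4.12 the same numbers count the standard Young tableaux with $n$ squares.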


4.3.2 Counting Tableaux of the Same Shape 


Let $T(\lambda)$ be the set of all standard Young tableaux with shape $\lambda$. Take for instance

[Figure: a shape $\lambda$ together with the set $T(\lambda)$ of its standard Young tableaux]

In general, let $\lambda = (n_1, n_2, \ldots, n_m)$ where $n_1 \ge n_2 \ge \ldots \ge n_m$ are numbers that sum up to $n$. With this in mind we define the function $t : \mathbb{Z}^m \to \mathbb{N}$ as

$$t(n_1, \ldots, n_m) = \begin{cases} |T((n_1, \ldots, n_m))| & \text{if } n_1 \ge n_2 \ge \ldots \ge n_m \ge 0, \\ 0 & \text{otherwise.} \end{cases}$$

For technical reasons, we allow trailing zero-length rows, for instance we could write $|T((3,2,0,0))| = |T((3,2,0))| = |T((3,2))| = 5$. We claim the function is fully characterized by the following identities:


(1) If the numbers $n_i$ fail to be weakly decreasing then
    $t(n_1, \ldots, n_m) = 0$.

(2) If the last number is zero, then it can be removed:
    $t(n_1, \ldots, n_m, 0) = t(n_1, \ldots, n_m)$.

(3) If the numbers are weakly decreasing and none is zero then
    $t(n_1, \ldots, n_m) = \sum_{i=1}^{m} t(n_1, \ldots, n_{i-1}, n_i - 1, n_{i+1}, \ldots, n_m) = t(n_1 - 1, \ldots, n_m) + t(n_1, n_2 - 1, \ldots, n_m) + \ldots$

(4) If only one number $n \ge 0$ is left then
    $t(n) = 1$.

It is obvious that $t$ fulfills (1), (2) and (4). And (3) is just the observation that the biggest number $n$ must be the last number in one of the $m$ rows. Consider for instance $\lambda = (5,4,4,3)$, then in any Young tableau the number $n = 16$ will be in one of the following positions:

[Figure: the shape (5,4,4,3) with the last square of each row marked]

For each case, we count the number of Young tableaux of the shape where the corresponding square was removed, i.e. we consider the shapes

[Figure: the four shapes obtained by removing one of the marked squares]

So we compute $t(4,4,4,3) + t(5,3,4,3) + t(5,4,3,3) + t(5,4,4,2)$. You may object that the second shape is not valid, but that's no problem since $t(5,3,4,3)$ was defined to be zero.

Note that (1)-(4) gives a complete recursion, i.e. for each input the value is either defined explicitly or given in terms of values for smaller inputs (in (2) the number of terms is decreased, in (3) the sum of the inputs is decreased). This means there is a unique solution. While we do not try the long and arduous journey of discovering the solution ourselves, we can, given the solution, verify that it is indeed correct.
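Since (1)-(4) form a complete recursion, they can be turned into a program almost literally. The sketch below is our illustration (the function name `t` mirrors the text, everything else is an implementation choice); memoization via `functools.lru_cache` is added only to avoid recomputing shapes.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def t(*shape):
    """Number of standard Young tableaux of the given shape via rules (1)-(4)."""
    # (1) not weakly decreasing (or containing a negative entry): no tableaux
    if any(a < b for a, b in zip(shape, shape[1:])) or any(a < 0 for a in shape):
        return 0
    # (2) a trailing zero-length row can be removed
    if len(shape) > 1 and shape[-1] == 0:
        return t(*shape[:-1])
    # (4) a single row admits exactly one tableau
    if len(shape) == 1:
        return 1
    # (3) the biggest number sits at the end of one of the rows
    return sum(
        t(*shape[:i], shape[i] - 1, *shape[i + 1:])
        for i in range(len(shape))
    )


if __name__ == "__main__":
    print(t(3, 2))
    print(t(4, 4, 4, 3) + t(5, 3, 4, 3) + t(5, 4, 3, 3) + t(5, 4, 4, 2) == t(5, 4, 4, 3))
```

The order of the checks matters: rule (3) is only reached for weakly decreasing shapes without zeros, exactly as required in the text.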

To this end, we need to first examine the Vandermonde determinant which is defined⁷ as

$$\Delta(x_1, \ldots, x_m) = \prod_{1 \le i < j \le m} (x_i - x_j).$$

It has the curious property that swapping two input values changes the sign of the output. To see this, observe firstly what happens if the adjacent inputs $x_i$ and $x_{i+1}$ in the argument list of $\Delta$ are swapped for some $i \in [m - 1]$. The factors not involving both $x_i$ and $x_{i+1}$ are merely permuted, with the exception of $(x_i - x_{i+1})$ which is replaced by $(x_{i+1} - x_i) = -(x_i - x_{i+1})$ as claimed. If two non-adjacent values $x_i, x_{i+k}$ are swapped, then this can be simulated by an odd number of swaps of adjacent elements. Think of a race with $n$ runners where Alice is in place $i$ and Bob in place $i + k$. We can make them change places as follows: First Bob overtakes $k$ people, putting him in front of Alice and then Alice (who is now in position $i + 1$) must fall behind $k - 1$ positions. This gives $2k - 1$ overtaking operations in total, an odd number.

We remark (without proof), that the Vandermonde is indeed a determinant, namely

$$\Delta(x_1, \ldots, x_m) = (-1)^{\binom{m}{2}} \det \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{m-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{m-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_m & x_m^2 & \cdots & x_m^{m-1} \end{pmatrix}.$$

Before we can get back to counting tableaux, we need to prove a technical Lemma:

Lemma 4.13. We have the following identity on polynomials over $x_1, x_2, \ldots, x_m, y$:

$$\sum_{k=1}^{m} x_k\, \Delta(x_1, \ldots, x_k + y, \ldots, x_m) = \Big(x_1 + x_2 + \ldots + x_m + \binom{m}{2} y\Big)\, \Delta(x_1, \ldots, x_m).$$

Proof. Let $g$ be the polynomial given as the sum on the left hand side. Observe what happens if we swap the roles of $x_i$ and $x_j$ (with $i < j$) in $g$. The summands for $k \notin \{i, j\}$ will just change sign (by our previous observation). The summand with $k = i$ turns into:

$$x_j\, \Delta(\ldots, \underbrace{x_j + y}_{\text{position } i}, \ldots, \underbrace{x_i}_{\text{position } j}, \ldots) = -x_j\, \Delta(\ldots, \underbrace{x_i}_{\text{position } i}, \ldots, \underbrace{x_j + y}_{\text{position } j}, \ldots).$$

So the term for $k = i$ became the negated term with $k = j$, and this works vice versa as well; altogether, the value of $g$ changes sign if $x_i$ and $x_j$ swap roles. Now consider the case of $x_i = x_j$, then swapping $x_i$ and $x_j$ obviously doesn't change anything. The only number that doesn't change if its sign changes is zero, so $g = 0$ whenever two $x_i$ and $x_j$ coincide (for $i \ne j$).

Now think of all the variables $x_2, \ldots, x_m, y$ as some (distinct) integer constants and of $x_1$ as the only actual variable, then $g$ is a polynomial of degree $m$ over $x_1$. It has several zeroes, one of which is $x_1 = x_2$. This allows us to divide $g$ by the degree 1 polynomial $x_1 - x_2$ (using polynomial long division) and we obtain $g = p \cdot (x_1 - x_2)$ for some polynomial $p$. In the same way we separate the other zeroes getting $g = p' \cdot (x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_m)$ for some polynomial $p'$. Then, looking at $p'$, we switch perspective thinking of $x_2$ as the variable and of the other values as (suitable distinct) constants. Looking at it that way we find the zeroes $x_2 = x_3, x_2 = x_4, \ldots, x_2 = x_m$ of $p'$ and can separate corresponding factors from $p'$. We repeat this for the remaining $m - 2$ variables as well. With this we eventually obtain:

$$g = p \cdot \prod_{1 \le i < j \le m} (x_i - x_j) \qquad (\star)$$

for some polynomial $p$. Note that this was the important step. Since this is not an algebra lecture, we allowed ourselves to be a bit sketchy: Formally we would have to argue that no funny business is going on when doing the polynomial long division and switching perspective between thinking of the $x_i$ as constants or variables.

Given $(\star)$, everything falls into place: The degree of the polynomial $g$ is $\binom{m}{2} + 1$ while the degree of the right hand side is $\binom{m}{2}$ plus the degree of $p$. So the degree of $p$ is one!

⁷ Our definition differs from the usual one in that it may have a different sign.

This means that $g$ is of the form:

$$g = (a_1 x_1 + a_2 x_2 + \ldots + a_m x_m + by) \cdot \prod_{1 \le i < j \le m} (x_i - x_j) = (a_1 x_1 + a_2 x_2 + \ldots + a_m x_m + by)\, \Delta(x_1, \ldots, x_m)$$

for some constants $a_1, \ldots, a_m, b$ still to be determined⁸. But recall that $g$ was originally defined to be:

$$g = \sum_{k=1}^{m} x_k\, \Delta(x_1, \ldots, x_k + y, \ldots, x_m).$$

Since the equation must hold for the special case of $y = 0$, we quickly see that $a_1 = a_2 = \ldots = a_m = 1$.

Now to determine $b$, we multiply out $g$ and collect the sum of all monomials containing $y$ with multiplicity 1. We start by analyzing the $k$-th summand of $g$:

$$x_k\, \Delta(x_1, \ldots, x_k + y, \ldots, x_m) = x_k \cdot \Big(\prod_{\substack{1 \le i < j \le m \\ i, j \ne k}} (x_i - x_j)\Big) (x_1 - x_k - y)(x_2 - x_k - y) \cdots (x_k + y - x_m).$$

To get monomials with a single $y$, choose $y$ from one of the factors and choose the part without $y$ everywhere else. This allows us to reassemble $\Delta$ with the exception of a single missing factor (we write it into the denominator). The $m - 1$ summands (one for each occurrence of $y$) add up to

$$x_k \prod_{1 \le i < j \le m} (x_i - x_j) \Big(\frac{-y}{x_1 - x_k} + \ldots + \frac{-y}{x_{k-1} - x_k} + \frac{y}{x_k - x_{k+1}} + \ldots + \frac{y}{x_k - x_m}\Big)$$
$$= y\, \Delta(x_1, \ldots, x_m) \Big(\sum_{i=1}^{k-1} \frac{-x_k}{x_i - x_k} + \sum_{i=k+1}^{m} \frac{x_k}{x_k - x_i}\Big) = y\, \Delta(x_1, \ldots, x_m) \sum_{i \ne k} \frac{-x_k}{x_i - x_k}.$$

If we sum up these terms for all $k$ we get

$$y\, \Delta(x_1, \ldots, x_m) \sum_{k=1}^{m} \sum_{i \ne k} \frac{-x_k}{x_i - x_k} = y\, \Delta(x_1, \ldots, x_m) \sum_{1 \le i < k \le m} \Big(\frac{-x_k}{x_i - x_k} + \frac{-x_i}{x_k - x_i}\Big)$$
$$= y\, \Delta(x_1, \ldots, x_m) \sum_{1 \le i < k \le m} \frac{x_k - x_i}{x_k - x_i} = y\, \Delta(x_1, \ldots, x_m) \binom{m}{2}.$$

This shows $b = \binom{m}{2}$, completing the proof.

⁸ You may object that $p$ may have a constant term. We call a polynomial homogeneous, if all monomials have full degree. Since $g$ and $\Delta(x_1, \ldots, x_m)$ are homogeneous, we can conclude that $p$ must be homogeneous as well.
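Because both sides of Lemma 4.13 are polynomials, evaluating them at integer points gives a cheap plausibility check (not a proof). The following sketch is ours; the helper names `vandermonde`, `lhs` and `rhs` are assumptions made for this illustration, and `math.prod` and `math.comb` require Python 3.8 or newer.

```python
from itertools import combinations
from math import comb, prod


def vandermonde(xs):
    """Delta(x_1, ..., x_m) = product of (x_i - x_j) over all i < j."""
    return prod(xs[i] - xs[j] for i, j in combinations(range(len(xs)), 2))


def lhs(xs, y):
    """Sum over k of x_k * Delta(x_1, ..., x_k + y, ..., x_m)."""
    return sum(
        x * vandermonde(xs[:k] + [x + y] + xs[k + 1:])
        for k, x in enumerate(xs)
    )


def rhs(xs, y):
    """(x_1 + ... + x_m + binom(m, 2) * y) * Delta(x_1, ..., x_m)."""
    return (sum(xs) + comb(len(xs), 2) * y) * vandermonde(xs)


if __name__ == "__main__":
    xs = [7, 5, 3, 2]
    for y in range(-3, 4):
        print(y, lhs(xs, y), rhs(xs, y))
```

The case $y = -1$ is the one that is needed when the Lemma is applied to counting tableaux.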
Theorem 4.14.

$$t(n_1, \ldots, n_m) = \frac{\Delta(x_1, \ldots, x_m)\, n!}{x_1!\, x_2! \cdots x_m!}$$

where $n = \sum_{i=1}^{m} n_i$, $x_i = n_i + m - i$ for $i = 1, \ldots, m$ and $n_1 \ge \ldots \ge n_m \ge 0$.


Proof. We show that the right hand side is a solution to the recurrence we found for $t$, i.e. if we define $t$ by the right hand side then (1)-(4) from above hold. First note what happens if the $x_i$ are not strictly decreasing, i.e. $x_i = x_{i+1}$ for some $i$. This gives a value of 0 since the factor $0 = x_i - x_{i+1}$ occurs in $\Delta(x_1, \ldots, x_m)$. We also know:

$$n_i + m - i = n_{i+1} + m - (i + 1) \;\Rightarrow\; n_i = n_{i+1} - 1,$$

so the $n_i$ are not weakly decreasing which means these values do not correspond to a valid shape for Young tableaux. Some other invalid shapes were already excluded by restricting ourselves to weakly decreasing $x_i$. All in all, we are consistent with:

(1) $t(n_1, \ldots, n_m) = 0$ unless $n_1 \ge \ldots \ge n_m \ge 0$.

The second thing we need to show is

(2) $t(n_1, \ldots, n_{m-1}, 0) = t(n_1, \ldots, n_{m-1})$.

Since $n_m = 0$ means $x_m = 0$, we calculate:

$$\Delta(x_1, \ldots, x_{m-1}, 0) = \prod_{1 \le i < j \le m} (x_i - x_j) = \prod_{1 \le i < j \le m-1} (x_i - x_j) \prod_{i=1}^{m-1} x_i$$
$$= \Delta(x_1, \ldots, x_{m-1}) \prod_{i=1}^{m-1} x_i = \Delta(x_1 - 1, \ldots, x_{m-1} - 1) \prod_{i=1}^{m-1} x_i.$$

With this it is easy to verify:

$$t(n_1, \ldots, n_{m-1}, 0) = \frac{\Delta(x_1, \ldots, x_{m-1}, 0)\, n!}{x_1! \cdots x_{m-1}!\, 0!} = \frac{\Delta(x_1 - 1, \ldots, x_{m-1} - 1)\, n!}{(x_1 - 1)! \cdots (x_{m-1} - 1)!} = t(n_1, \ldots, n_{m-1}).$$


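The hard part is (3), but we did most of the work already in the last Lemma.

(3) $t(n_1, \ldots, n_m) = \sum_{k=1}^{m} t(n_1, \ldots, n_k - 1, \ldots, n_m)$.

First note that, by Lemma 4.13 with $y = -1$ and since $x_1 + \ldots + x_m = n + \binom{m}{2}$:

$$\sum_{k=1}^{m} x_k\, \Delta(x_1, \ldots, x_k - 1, \ldots, x_m) = \Big(x_1 + \ldots + x_m - \binom{m}{2}\Big)\, \Delta(x_1, \ldots, x_m) = n\, \Delta(x_1, \ldots, x_m).$$

Hence

$$\frac{\Delta(x_1, \ldots, x_m)\, n!}{x_1! \cdots x_m!} = \sum_{k=1}^{m} \frac{x_k\, \Delta(x_1, \ldots, x_k - 1, \ldots, x_m)\,(n-1)!}{x_1! \cdots x_m!} = \sum_{k=1}^{m} \frac{\Delta(x_1, \ldots, x_k - 1, \ldots, x_m)\,(n-1)!}{x_1! \cdots (x_k - 1)! \cdots x_m!} = \sum_{k=1}^{m} t(n_1, \ldots, n_k - 1, \ldots, n_m).$$

The last property is again easy to verify. For $m = 1$ and $n = n_1$, we have:

(4) $t(n_1) = \frac{\Delta(x_1)\, n!}{x_1!} = \frac{\Delta(n_1)\, n_1!}{n_1!} = 1$

where $\Delta(n_1) = 1$ because it is an empty product.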
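The closed formula of Theorem 4.14 is straightforward to evaluate. The following is a minimal sketch with a function name of our own (`t_formula`); it assumes, as the theorem does, a weakly decreasing shape with non-negative entries and does not validate its input beyond that.

```python
from itertools import combinations
from math import factorial, prod


def t_formula(shape):
    """Evaluate Delta(x_1..x_m) * n! / (x_1! ... x_m!) with x_i = n_i + m - i."""
    m, n = len(shape), sum(shape)
    xs = [n_i + m - i for i, n_i in enumerate(shape, start=1)]
    # combinations() yields the pairs (x_i, x_j) with i < j in this order.
    delta = prod(a - b for a, b in combinations(xs, 2))
    # The quotient is an integer by the theorem, so integer division is exact.
    return delta * factorial(n) // prod(factorial(x) for x in xs)


if __name__ == "__main__":
    print(t_formula((3, 2)))
    print(t_formula((3, 2, 0, 0)))
```

For the shape $(3,2)$ we get $x = (4, 2)$, $\Delta = 2$ and $2 \cdot 5! / (4!\,2!) = 5$, matching $|T((3,2))| = 5$ from above.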


With the last Theorem it would already be possible to count Young tableaux of any shape fairly conveniently. However, with a geometric interpretation, it gets even easier. This geometric interpretation uses hooks.

Consider a partition $\lambda = (n_1, \ldots, n_m)$ and a square $(i,j) \in \lambda$, by which we mean the square in row $i$ and column $j$. The hook $h_{i,j}$ rooted in $(i,j)$ is the union of the square $(i,j)$ itself, all squares to the right of it in the same row and all squares below it in the same column. Here is a picture, a Ferrer diagram where one hook is highlighted; it has length 7.

[Figure: Ferrer diagram with a highlighted hook of length 7]


Theorem 4.15 (Hook length formula). Let $\lambda$ be a partition of $n$. Then

$$t(\lambda) = \frac{n!}{\prod_{(i,j) \in \lambda} |h_{i,j}|}.$$

Proof. Let $\lambda = (n_1, \ldots, n_m)$. We multiply the lengths of all the hooks, going through them row by row. For row $i$, consider first the line that starts at square $(m, 1)$, goes upwards to $(i, 1)$ and then rightwards and ends at square $(i, n_i)$. It has length $(n_i + m - i) = x_i$. The hooks in row $i$ all end in $(i, n_i)$ but they start in different positions. We go through the possible starting positions along the boundary of the diagram as shown in the following figure (for $i = 3$).

[Figure: Ferrer diagram with the line for row i = 3; positions that are not starts of hooks are marked with a dot]

Now if there was a hook starting in each integer position this line passes through, the product of the hook lengths would just be $x_i!$. However, the positions marked with a dot do not correspond to starts of hooks so for each dot we have to divide by the length that a hook starting there would have. The dot in line $j$ ($j > i$) is in position $(j, n_j + 1)$ so a hook going to $(i, n_i)$ has length $(j - i + n_i - n_j) = (x_i - x_j)$. So the product of all valid hooks for line $i$ is $\frac{x_i!}{\prod_{j > i} (x_i - x_j)}$. Now for the product of all hook lengths of all rows we get:

$$\prod_{(i,j) \in \lambda} |h_{i,j}| = \prod_{1 \le i \le m} \frac{x_i!}{\prod_{j > i} (x_i - x_j)} = \frac{\prod_{i} x_i!}{\prod_{i < j} (x_i - x_j)} = \frac{x_1! \cdots x_m!}{\Delta(x_1, \ldots, x_m)} \overset{4.14}{=} \frac{n!}{t(\lambda)}.$$
jg>u 


(4,9)EA 1<i<m : i<j 


Take for instance the following Ferrer diagram, that we annotate with the lengths of the hooks rooted at the respective position:

7 6 3 1
5 4 1
3 2
2 1

The number of Young tableaux of this shape is by the Hook length formula 11! divided by all the hook lengths, so:

$$\frac{11!}{7 \cdot 6 \cdot 3 \cdot 1 \cdot 5 \cdot 4 \cdot 1 \cdot 3 \cdot 2 \cdot 2 \cdot 1} = \frac{11 \cdot 10 \cdot 9 \cdot 8}{3 \cdot 2} = 11 \cdot 10 \cdot 3 \cdot 4 = 1320.$$
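The hook length formula translates directly into a few lines of code. This is a sketch using our own function names (`hook_lengths`, `count_tableaux`); a shape is given as a weakly decreasing tuple of positive row lengths, and rows and columns are indexed from 0 inside the code.

```python
from math import factorial, prod


def hook_lengths(shape):
    """Return the hook length of every square, row by row."""
    # cols[j] is the length of column j, i.e. the number of rows longer than j.
    cols = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    return [
        # squares to the right + squares below + the square itself
        [(shape[i] - j - 1) + (cols[j] - i - 1) + 1 for j in range(shape[i])]
        for i in range(len(shape))
    ]


def count_tableaux(shape):
    """Number of standard Young tableaux of the given shape."""
    hooks = [h for row in hook_lengths(shape) for h in row]
    return factorial(sum(shape)) // prod(hooks)


if __name__ == "__main__":
    print(hook_lengths((4, 3, 2, 2)))
    print(count_tableaux((4, 3, 2, 2)))
```

For the shape $(4,3,2,2)$ this reproduces the annotated diagram and the count from the example above.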




5 Partially Ordered Sets 


Example 5.1. As was accurately observed by Randall Munroe, creator of xkcd, different fruit not only differ in their tastiness, but also in the difficulty of preparing them. Some types of fruit are clearly superior to others, for instance seedless grapes are both more tasty and easier to prepare than oranges. However, if you compare pineapples with bananas then pineapples are more tasty but harder to prepare, so there is no clear winner.

[Figure: chart placing fruit (among them peaches, blueberries, pineapples, pomegranates, bananas, tomatoes, grapefruit and lemons) on the axes difficult/easy and untasty/tasty]

Figure 27: https://xkcd.com/388/: "Coconuts are so far down to the left they couldn't be fit on the chart. Ever spent half an hour trying to open a coconut with a rock?" (Randall Munroe)

To model these situations where some things are bigger/better/above/dominating other things but some pairs of things may also be incomparable/on the same level/of equal rank, we introduce the concept of partial orders.


Definition 5.2. A partially ordered set or poset or partial order is a pair $P = (X, \le)$ where $X$ is a (for us usually finite) set and $\le$ is a binary relation that is:

(i) reflexive: $x \le x$, $\forall x \in X$.

(ii) transitive: $(x \le y \text{ and } y \le z) \Rightarrow x \le z$, $\forall x, y, z \in X$.

(iii) antisymmetric: $(x \le y \text{ and } y \le x) \Rightarrow x = y$, $\forall x, y \in X$.


You probably already know some (partial) orders, and recognize a few examples from Table 3. Note that a total order (where for any $x, y \in X$ we have $x \le y$ or $y \le x$) is a special case of a partial order. We now introduce some notation.

• If $x \le y$ and $x \ne y$ then we write $x < y$.

• If $x \le y$ we also write $y \ge x$.

• If neither $x \le y$ nor $y \le x$ then $x, y$ are incomparable, denoted by $x \parallel y$.

• If $x < y$ and there is no $z$ with $x < z < y$, then $x < y$ is a cover relation (or cover for short), denoted by $x \lessdot y$.

X                        ≤                          Example Relations
-----------------------  -------------------------  ---------------------------------
natural numbers          order by value             23 ≤ 42, 42 ≰ 23
natural numbers          order by divisibility      6 ≰ 7, 7 ≰ 6, 13 ≤ 91
polynomials              order by divisibility      X − 7 ≤ X² − 10X + 21
words over {A,...,Z}     order lexicographically    ELEPHANT ≤ MOUSE
real intervals           order by inclusion         [3,5] ≤ [2,7], [2,3] ≰ [4,6]
real intervals           "completely left of"       [3,5] ≰ [2,7], [2,3] ≤ [4,6]
real valued functions    point-wise domination      1 + x ≰ x², sin(x) ≤ 2
subsets of [N]           inclusion                  {1,3} ≤ {1,2,3,5}

Table 3: Some posets $P = (X, \le)$ with informal definitions of their order relations and examples.


We depict a poset by its Hasse diagram. In it, every x € X corresponds to a 
point in the plane and for every x,y € X with x<y the corresponding points are 
connected by a y-monotone line where «x is the lower (with smaller y-coordinate) 
endpoint. In particular “transitive edges” are omitted meaning x,y € P are not 
directly connected if there is a z such that x < z < y. The poset from Example 
5.1 has the Hasse diagram shown in Figure 28. 
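The edges of a Hasse diagram are exactly the cover relations, which can be computed directly from the definition. The following sketch uses the divisors of 12 ordered by divisibility as a hypothetical example; it is not the fruit poset of Example 5.1.

```python
def cover_relations(elements, leq):
    """Return all pairs (x, y) where y covers x, i.e. the Hasse diagram edges."""
    elements = list(elements)
    less = lambda x, y: x != y and leq(x, y)
    return [(x, y)
            for x in elements for y in elements
            if less(x, y)
            # x < y is a cover if no z lies strictly in between
            and not any(less(x, z) and less(z, y) for z in elements)]

divisors_of_12 = [1, 2, 3, 4, 6, 12]
divides = lambda x, y: y % x == 0
print(cover_relations(divisors_of_12, divides))
# [(1, 2), (1, 3), (2, 4), (2, 6), (3, 6), (4, 12), (6, 12)]
```

Pairs such as (1, 4) or (2, 12) are related but do not appear, since they are "transitive edges".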


Figure 28: Hasse diagram for the poset originating from Example 5.1. For instance
lemons and bananas are connected by a line since bananas are both tastier and 
easier to prepare. Lemons and cherries are not connected by a direct line since 
they are already connected via bananas. Bananas and watermelons are not 
connected at all since they are incomparable. 




Sometimes it may be easier to define the cover relations than to define the 
entire relation. 


Example. Let $X$ be the set of states of a Rubik's Cube. For $x \in X$ define $r(x)$
to be the minimum number of moves needed to solve the cube from state $x$.

We want $x \lessdot y$ if and only if $r(x) < r(y)$ and $x$ and $y$ can be
transformed into one another by one move. This is the cover relation of a poset
(without proof).


We now define some key concepts and corresponding notation. 
Definition 5.3. Let $P = (X, \le)$ be a poset.

• A chain is a sequence of elements in increasing order, i.e. $y_1 < y_2 < \dots < y_k$
for $y_i \in X$. It has length $k$.

• The height $h(P)$ of $P$ is the length of a longest chain in $P$.

• A set $Y \subseteq X$ where elements are pairwise incomparable ($y_1 \parallel y_2$
for all $y_1 \ne y_2 \in Y$) is an antichain.

• The width $w(P)$ of $P$ is the size of a largest antichain in $P$.

• The sets of minimal and maximal elements of $P$ are given as

\[ \min(P) := \{x \in X \mid \forall y \in X : x \le y \text{ or } x \parallel y\}, \]
\[ \max(P) := \{x \in X \mid \forall y \in X : x \ge y \text{ or } x \parallel y\}. \]


Looking at the poset $P$ from Figure 28 again, an example of a chain would
be {Watermelons, Pears, Strawberries}. There are several longest chains, one
of which is {Grapefruit, Oranges, Bananas, Plums, Pears, Blueberries, Seedless
Grapes}. The height of $P$ is therefore 7. An example of an antichain
is {Pineapples, Cherries, Plums, Red Apples}. No antichain is larger, so
the width of $P$ is 4. The maximal elements are max(P) = {Seedless Grapes,
Peaches} and the minimal elements are min(P) = {Pineapples, Pomegranates,
Grapefruit, Lemons}.

We now study partitions of posets into chains and antichains. We start with
the easier case.


Theorem 5.4 (Antichain Partitioning). The elements of every poset $P = (X, \le)$
can be partitioned into $h(P)$ antichains (and not fewer).


Proof. Note first that we cannot partition P into fewer antichains: No antichain 
can contain two elements of a chain, since elements of chains are pairwise com- 
parable and elements of antichains are pairwise incomparable. Since P contains 
a chain Y of size h(P), at least h(P) antichains are needed. 

To see that $h(P)$ antichains suffice, first note that $\min(P)$ is an antichain:
Two minimal elements are always unrelated, otherwise the "bigger" minimal
element would not be minimal at all. Also, every maximal chain in $P$ contains an
element from $\min(P)$: If $Y = \{y_1,\dots,y_k\}$ is a maximal chain with
$y_1 < \dots < y_k$ and $y_1$ is not minimal, then we would find $y_0 < y_1$ and
therefore a longer chain.

So we take the first antichain to be $\min(P)$, then we still need to partition
$X \setminus \min(P)$ into $h(P) - 1$ antichains. Since the maximal chains in
$X \setminus \min(P)$ have size at most $h(P) - 1$ we can do this by induction.

Figure 29: Partition of the poset into 7 antichains obtained by iteratively putting
the minimal remaining elements into a new antichain.


To get back to our fruit example, Figure 29 shows how the poset can be 
partitioned into 7 antichains. 
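The inductive proof of Theorem 5.4 translates directly into a procedure: repeatedly remove the minimal elements. The sketch below illustrates this on a hypothetical example (the numbers 1 to 12 ordered by divisibility), not on the fruit poset.

```python
def antichain_partition(elements, leq):
    """Split a finite poset into antichains by repeatedly removing min(P)."""
    remaining = list(elements)
    layers = []
    while remaining:
        # x is minimal among the remaining elements if no other one lies below it.
        minimal = [x for x in remaining
                   if not any(y != x and leq(y, x) for y in remaining)]
        layers.append(minimal)
        remaining = [x for x in remaining if x not in minimal]
    return layers

divides = lambda x, y: y % x == 0
layers = antichain_partition(range(1, 13), divides)
print(layers)       # [[1], [2, 3, 5, 7, 11], [4, 6, 9, 10], [8, 12]]
print(len(layers))  # 4, the height: 1 < 2 < 4 < 8 is a longest chain
```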


Theorem 5.5 (Dilworth's Theorem). Every poset $P = (X, \le)$ can be partitioned
into $w(P)$ chains (and not fewer).


Proof. Clearly, w(P) chains are necessary: P contains an antichain of size w(P) 
and no two of its elements can be contained in the same chain. 

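To show $w(P)$ chains suffice, we do induction on $|X|$. Consider a maximum
antichain $A = \{x_1,\dots,x_{w(P)}\}$.

The idea is to split $P$ along $A$ into two parts. Take for instance our fruit
poset and $A$ = {Seeded Grapes, Cherries, Plums, Red Apples}; then the two
parts are shown in Figure 30.

Formally we define $P_1 = (X_1, \le)$ and $P_2 = (X_2, \le)$ with elements

\[ X_1 := \{y \in X \mid \exists x \in A : x \le y\}, \qquad
   X_2 := \{y \in X \mid \exists x \in A : y \le x\}, \]

and the same relations as before. Note that with this definition $X_1 \cap X_2 = A$,
since if $y \in X_1 \cap X_2$ then there are $x_1 \in A$ and $x_2 \in A$ with
$x_1 \le y \le x_2$. Since $A$ is an antichain this implies $x_1 = x_2 = y$, so
$y \in A$. Moreover $X_1 \cup X_2 = X$, since an element incomparable to every
element of $A$ could be added to $A$, contradicting that $A$ is a maximum antichain.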


Now if we can partition $P_1$ into chains $C^1_1,\dots,C^1_{w(P)}$ and $P_2$ into
chains $C^2_1,\dots,C^2_{w(P)}$ where $C^1_i$ and $C^2_i$ both contain $x_i$, then
we can attach the chains to one another, obtaining a partition of $P$ into chains
$C_1,\dots,C_{w(P)}$. We can find these partitions by induction unless $P_1$ or
$P_2$ fail to be smaller than $P$.

Convince yourself that $P_1 = P \Leftrightarrow A = \min(P)$ and
$P_2 = P \Leftrightarrow A = \max(P)$. So our only problem is the case where
there is no largest antichain except for $\min(P)$, $\max(P)$ or both.

In that case, let $C$ be any maximal chain. In the same way as in the previous
Theorem we argue that $\min(P) \cap C \ne \emptyset$ and $\max(P) \cap C \ne \emptyset$.
Then $P \setminus C$ has width $w(P) - 1$ (since we removed an element from each
largest antichain) so it can be partitioned into $w(P) - 1$ chains by induction
and we are done.

Figure 30: We split the poset $P$ along an antichain into $P_1$ (the upper half)
and $P_2$ (the lower half). The elements of the antichain are contained in both
posets after the split.
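A minimum chain partition can also be computed in practice. The sketch below does not follow the inductive proof above; it uses a different, standard technique (maximum bipartite matching on the strict order relation, where each matched pair links an element to its successor in a chain). All names and the example poset are illustrative assumptions.

```python
def minimum_chain_partition(elements, leq):
    """Partition a finite poset into as few chains as possible (via matching)."""
    elements = list(elements)
    n = len(elements)
    # above[i] lists all j with elements[i] < elements[j].
    above = [[j for j in range(n)
              if j != i and leq(elements[i], elements[j])] for i in range(n)]
    successor = [None] * n    # successor[i]: next element in the chain of i
    predecessor = [None] * n  # predecessor[j]: previous element in the chain of j

    def augment(i, seen):
        # Try to give element i a successor, re-routing earlier choices if needed.
        for j in above[i]:
            if j in seen:
                continue
            seen.add(j)
            if predecessor[j] is None or augment(predecessor[j], seen):
                successor[i], predecessor[j] = j, i
                return True
        return False

    for i in range(n):
        augment(i, set())

    chains = []
    for i in range(n):
        if predecessor[i] is None:  # i is the bottom element of a chain
            chain, j = [], i
            while j is not None:
                chain.append(elements[j])
                j = successor[j]
            chains.append(chain)
    return chains

divides = lambda x, y: y % x == 0
chains = minimum_chain_partition(range(1, 13), divides)
print(len(chains))  # 6, matching the antichain {7, 8, 9, 10, 11, 12}
```

In this example the number of chains equals the size of the antichain {7, ..., 12}, as Dilworth's Theorem predicts.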


5.1 Subposets, Extensions and Dimension 


We now define when a poset is part of another poset, that is, when $P = (X, \le_P)$
is a subposet of $Q = (Y, \le_Q)$. This should be the case if $X \subseteq Y$ and
$\forall x_1, x_2 \in X : (x_1 \le_P x_2 \Leftrightarrow x_1 \le_Q x_2)$. In that
case we say $P$ is the subposet of $Q$ induced by $X$. But since we do not care
about relabeling of elements, the notion of subposet is actually a bit more
generous:

$P$ is a subposet of $Q$ if there is an injective function $f : X \to Y$ such that
$\forall x_1, x_2 \in X : (x_1 \le_P x_2 \Leftrightarrow f(x_1) \le_Q f(x_2))$. The
function $f$ is an embedding and the image of $f$ is a copy of $P$ in $Q$.

We say $Q = (X, \le_Q)$ is an extension of $P = (X, \le_P)$ (note that they use the
same ground set $X$) if $\forall x_1, x_2 \in X : x_1 \le_P x_2 \Rightarrow x_1 \le_Q x_2$.
When interpreting relations as subsets of $X \times X$ we could write
$\le_P \subseteq \le_Q$.

We say $L = (X, \le_L)$ is a linear extension of $P$ if it is an extension of $P$
and a total order (meaning no two elements are incomparable in $\le_L$).


Lemma 5.6. Let $P_1 = (X, \le_1)$, $P_2 = (X, \le_2)$ be two posets on the same
ground set $X$. Then we define $P = P_1 \cap P_2 = (X, \le)$ with the relation
$\le \,=\, \le_1 \cap \le_2$, meaning

\[ x \le y \Leftrightarrow (x \le_1 y \text{ and } x \le_2 y). \]

We claim $P$ is a poset and $P_1$ and $P_2$ are extensions of $P$.

Figure 31: With the posets as shown (via their Hasse diagrams) $P$ is a subposet
of $Q$ as witnessed by the map $w \mapsto a$, $x \mapsto c$, $y \mapsto d$,
$z \mapsto e$. Actually, there are four different embeddings, since $w$ could also
be mapped to $b$, and $x$ and $y$ can also be mapped to $c$ and $d$ the other way
round. These embeddings correspond to two copies of $P$ in $Q$, namely
$\{a,c,d,e\}$ and $\{b,c,d,e\}$. The poset $P'$ is not a subposet of $Q$. However,
$Q$ is an extension of $P'$.


Proof. It is easy to verify that $\le$ is an order relation. It is also clear that
all pairs that are related via $\le$ are related via $\le_1$ and $\le_2$, so $\le_1$
and $\le_2$ are extensions of $\le$.


Two-dimensional posets. Recall the poset $P$ from the fruit example. It has
two natural linear extensions: The first is the total order $L_{taste}$ in which
the fruit are arranged in order of increasing tastiness:

\[ \text{Lemons} <_{L_{taste}} \text{Grapefruit} <_{L_{taste}} \text{Tomatoes} <_{L_{taste}} \dots <_{L_{taste}} \text{Peaches}. \]

In other words, $L_{taste}$ is the order obtained when projecting all elements to
the tastiness-axis (note how this is an extension of $P$ and a total order,
assuming no two fruit have exactly the same tastiness). The second is the total
order $L_{ease}$, in which the fruits are arranged in order of increasing ease of
preparing them:

\[ \text{Pineapples} <_{L_{ease}} \text{Pomegranates} <_{L_{ease}} \dots <_{L_{ease}} \text{Seedless Grapes}. \]

Fruit are ordered if and only if they are ordered according to both linear orders,
so $P = L_{taste} \cap L_{ease}$.

We want to capture this property in a new notion and define: A poset is
2-dimensional if it has a two-dimensional picture, i.e. an assignment
$f : P \to \mathbb{R}^2$ of positions in the plane to each element such that
$x < y$ in $P$ if and only if $f(y)$ is above and right of $f(x)$. (Strictly
speaking we would have to say: It has a two-dimensional picture but no
one-dimensional picture; for details refer to the general definition later.)

Not every poset is 2-dimensional. Consider the spider poset with three
legs, whose Hasse diagram is shown on the left of Figure 32. This spider has a
head $H$, three knees $K_1, K_2, K_3$ considered bigger than the head and three
feet $F_1, F_2, F_3$ considered smaller than the corresponding knee. There are no
further relations.

We try to find a two-dimensional embedding into the plane, i.e. assign
positions in the plane to each element (see right side of Figure 32).

The head $H$ has to go somewhere, which partitions the plane into four quadrants
(above and right of $H$, below and right of $H$, above and left of $H$, below
and left of $H$). The knees must all go above and right of $H$ since they are
bigger than $H$. Since knees are incomparable, the ones with bigger x-coordinate
must have smaller y-coordinate, so the knees must lie on a decreasing curve as
shown. Without loss of generality, $K_2$ is the second point on this decreasing
curve. Now observe that there is no suitable point to put $F_2$: It cannot be in
the quadrant below and left of $H$ nor in the quadrant above and right of $H$
since $H \parallel F_2$. Since it must also be below and left of $K_2$, it must be
in the shaded area. But points in that area are below and left of either $K_1$ or
$K_3$ (or both), which is not allowed since $F_2 \parallel K_1$ and
$F_2 \parallel K_3$. This completes the proof.

Figure 32: On the left: The spider with three legs. On the right: Sketch for the
proof that the spider is not two-dimensional.


Sums of Chains. We define an additional bit of notation. By $i \oplus k$ for
$i,k \in \mathbb{N}$ we mean the poset that is the union of a chain of length $i$
and a chain of length $k$ with no comparabilities between the chains. For instance
$1 \oplus 1$ is an antichain of size 2. The poset important in the following is
$1 \oplus 2$. It consists of three elements $a,b,c$ with $b < c$ and
$a \parallel b$, $a \parallel c$. Note that $1 \oplus 2$ has three linear
extensions, $a < b < c$, $b < c < a$, $b < a < c$, the last of which will have a
special standing in the following: Given a linear extension $L$ of some poset $P$
and a copy of $1 \oplus 2$ in $P$, we say $L$ orders this copy of $1 \oplus 2$ if
the lonely element (corresponding to $a$ above) does not come in between the
other two elements (corresponding to $b$ and $c$ above). Take for instance the
linear ordering $L : a < d < b < c < e < f$ of the following poset (shown on the
left). It contains 8 copies of $1 \oplus 2$, listed on the right, and almost all
of them are ordered by $L$.

[Figure: Hasse diagram of a poset on the elements a, b, c, d, e, f (left) and
its 8 copies of $1 \oplus 2$ (right).]

There is exactly one exception. It is not obvious whether or not a different
linear extension would have ordered every copy of $1 \oplus 2$ (you will have to
find that out on the exercise sheet). What we show now is the significance of the
existence of such a linear extension.


Theorem 5.7. A poset $P$ is 2-dimensional if and only if there is a linear
extension $L$ of $P$ ordering all copies of $1 \oplus 2$ in $P$.


Proof. "$\Rightarrow$" Assume $P$ is 2-dimensional. Then $P = L_1 \cap L_2$ for
the two linear extensions $L_1, L_2$ of $P$ corresponding to an embedding into
$\mathbb{R}^2$. We claim that any of the linear extensions, say $L_1$, orders
every copy of $1 \oplus 2$.

Assume not, then we find elements $a,b,c$ in $P$ that form an unordered copy of
$1 \oplus 2$, i.e. we have $b <_P c$, $a \parallel b$, $a \parallel c$ with
$b <_{L_1} a <_{L_1} c$. Since $b <_P c$ we also have $b <_{L_2} c$, so we are in
this situation:

[Figure: positions of $b$ and $c$ in the plane, with the horizontal axis ordered
by $L_1$ and two shaded areas.]

We know from $L_1$ that, horizontally, $a$ is between $b$ and $c$, but vertically
it must be below $b$ (since $a \parallel b$) and it must be above $c$ (since
$a \parallel c$). Clearly this is not possible (it would have to be in both
shaded areas at once), which contradicts the assumption that $L_1$ does not order
a copy of $1 \oplus 2$.

"$\Leftarrow$" Let $L$ be a linear extension of $P$ ordering all copies of
$1 \oplus 2$. From this we construct an embedding into $\mathbb{R}^2$. First label
the elements of $P$ in accordance with $L$, i.e. $x_1 <_L \dots <_L x_n$. The
x-coordinate of $x_i$ will just be $i$. We define the y-coordinates from left to
right, i.e. we define that of $x_i$ assuming those of $x_1,\dots,x_{i-1}$ are
already given.

Consider the y-coordinates of $\{x_j \mid j < i, x_j <_P x_i\}$ and place $x_i$
barely above the maximum such y-coordinate (if there is no such $x_j$, place
$x_i$ below all previous elements).

Clearly, this procedure ensures that if we have $x_j <_P x_i$ then $x_i$ is put
above and right of $x_j$. But we also have to ensure that incomparable elements
are not placed this way. Assume there is a smallest index $i$ where
$x_j \parallel x_i$ for some $j < i$ and the above procedure will still put $x_i$
above and right of $x_j$.

[Figure: $x_i$ placed above and right of $x_j$, with a red horizontal line to the
left of $x_j$ and a blue horizontal line to the right of $x_j$.]

We must have put $x_i$ so high for a reason: There must be some $x_k$ ($k < i$)
with $x_k <_P x_i$ and $x_i$ was therefore assigned a y-coordinate barely above
$x_k$. In the drawing above, $x_k$ is therefore on one of the horizontal lines,
either to the left or to the right of $x_j$. It cannot be on the right (i.e. on
the blue line and $j < k < i$) since that would imply $x_j < x_k$ (otherwise we
would have made a mistake already earlier, but $i$ was minimal) and because of
$x_k < x_i$ and transitivity we would also have $x_j < x_i$, contradicting our
assumption of $x_j \parallel x_i$.

So we have that $x_k$ is to the left of $x_j$ (i.e. on the red line and $k < j$).
This means $x_k \parallel x_j$ and therefore $x_j$ together with the chain
$x_k < x_i$ is a copy of $1 \oplus 2$ in $P$. Since in $L$ we have
$x_k < x_j < x_i$ it is not ordered by $L$, a contradiction.


Higher dimensional posets. There is a natural way to generalize the notion
from the previous few pages to arbitrary dimension. First note that for
$d \in \mathbb{N}$ the set $\mathbb{R}^d$ becomes an (infinite) poset via the
dominance order, which simply means that for $x = (x_1,\dots,x_d) \in \mathbb{R}^d$,
$y = (y_1,\dots,y_d) \in \mathbb{R}^d$ we have

\[ x \le_{dom} y \Leftrightarrow \forall i \in [d] : x_i \le y_i. \]

We say a poset $P$ is d-dimensional if it is a subposet of $(\mathbb{R}^d, \le_{dom})$
and not a subposet of $(\mathbb{R}^{d-1}, \le_{dom})$.


Theorem 5.8. A poset $P$ is d-dimensional if and only if $d$ is the smallest
number of linear extensions of $P$ whose intersection is $P$.


Proof. It suffices to show that $P$ is a subposet of $\mathbb{R}^d$ if and only if
$P = \bigcap_{i=1}^{d} L_i$ for some linear extensions $L_1,\dots,L_d$ of $P$.

"$\Rightarrow$" We can assume, without loss of generality, that no two points
share a coordinate (we can always break ties without changing the relations in
the poset). Then define $L_i$ as the order of the elements on the projection of
$P$ to the $i$-th coordinate. This is a set of linear extensions with
intersection $P$.

"$\Leftarrow$" Given linear extensions $L_1,\dots,L_d$ we can take these to assign
coordinates. The coordinate of $x \in P$ will be $(r_1,\dots,r_d) \in \mathbb{R}^d$
where $r_i$ is the rank of $x$ in the $i$-th linear extension, i.e.
$r_i = |\{y \in P \mid y <_{L_i} x\}|$. It is easy to see that this gives an
embedding of $P$ into $\mathbb{R}^d$.
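The second direction of the proof is constructive and can be sketched in a few lines. The example poset below ($1 \oplus 2$ on the elements a, b, c with its realizer) is chosen here for illustration; the function names are assumptions.

```python
from itertools import product

def embed(realizer):
    """Map each element to its vector of ranks in the given linear extensions."""
    return {x: tuple(L.index(x) for L in realizer) for x in realizer[0]}

def dominated(p, q):
    """Dominance order: p <= q in every coordinate."""
    return all(a <= b for a, b in zip(p, q))

# The poset 1 + 2 on {a, b, c} with b < c and a incomparable to both,
# realized by the two linear extensions a < b < c and b < c < a.
realizer = [["a", "b", "c"], ["b", "c", "a"]]
point = embed(realizer)
print(point)  # {'a': (0, 2), 'b': (1, 0), 'c': (2, 1)}

relations = {(x, y) for x, y in product(point, repeat=2)
             if dominated(point[x], point[y])}
print(sorted(relations))
# [('a', 'a'), ('b', 'b'), ('b', 'c'), ('c', 'c')]
```

The only non-reflexive relation recovered from the coordinates is b < c, as required.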


Next we observe that every poset has a well-defined dimension, in other 
words, every poset can be written as the intersection of d linear orders for some 
large enough d. We call such a set of linear extensions a realizer of P. 

We claim that taking all linear extensions
$\mathcal{L} = \{L \mid L \text{ is a linear extension of } P\}$ works, i.e.
$P = \bigcap_{L \in \mathcal{L}} L$. It is clear that "$\subseteq$" holds. We
leave the other direction as an exercise.
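For small posets the claim can be checked by enumerating all linear extensions. This is a brute-force sketch on a hypothetical example (the divisors of 12 under divisibility); it illustrates the statement and does not replace the exercise.

```python
from itertools import permutations

def linear_extensions(elements, leq):
    """Yield every ordering of the elements that respects the partial order."""
    elements = list(elements)
    for order in permutations(elements):
        position = {x: i for i, x in enumerate(order)}
        if all(position[x] <= position[y]
               for x in elements for y in elements if leq(x, y)):
            yield order

def intersection(extensions, elements):
    """All pairs (x, y) with x before or equal to y in every given extension."""
    positions = [{x: i for i, x in enumerate(L)} for L in extensions]
    return {(x, y) for x in elements for y in elements
            if all(p[x] <= p[y] for p in positions)}

divisors_of_12 = [1, 2, 3, 4, 6, 12]
divides = lambda x, y: y % x == 0
extensions = list(linear_extensions(divisors_of_12, divides))
print(len(extensions))  # 5

relation = {(x, y) for x in divisors_of_12 for y in divisors_of_12 if divides(x, y)}
print(intersection(extensions, divisors_of_12) == relation)  # True
```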

Our next result shows that the dimension does not only exist but is actually 
bounded by the width of the poset. We require some preparation though. 


Definition 5.9. Let $P = (X, \le)$ be a poset.

• For $S \subseteq X$ define the open downset
$D(S) = \{y \in X \mid \exists x \in S : y < x\}$ and the closed downset
$D[S] = D(S) \cup S$.

• For single-element sets $S = \{s\}$ we will simply write $D(s)$ and $D[s]$
instead of $D(\{s\})$ and $D[\{s\}]$.

• In the same way define open and closed upsets:

\[ U(S) = \{y \in X \mid \exists x \in S : x < y\}, \quad U[S] = U(S) \cup S, \quad
   U(s) = U(\{s\}), \quad U[s] = U[\{s\}]. \]

Consider the left of Figure 33 for an example.

Theorem 5.10. For any poset $P$ its dimension is at most its width, i.e.
$\dim(P) \le w(P)$.


Proof. Our goal is to find a realizer consisting of $w = w(P)$ linear extensions.
By Dilworth's Theorem we can partition $P$ into $w(P)$ chains $C_1,\dots,C_w$.

Claim. For any chain $C$ there is a linear extension $L$ of $P$ such that for any
$x \in C$ and $x \parallel y$ we have $y <_L x$.

Figure 33: On the left: The open upset of {Cherries, Red Apples, Blueberries}
is shown in red and the closed downset of {Pineapples, Bananas} is shown in
dashed blue.

On the right: Sketch for the claim from Theorem 5.10. Consider the chain 
C = {Oranges, Watermelons, Cherries, Pears, Strawberries}. Then L is a linear 
extension that puts the elements from C as late as possible, for instance like this 
(elements from C highlighted): L : Grapefruit, Lemons, Tomatoes, Pineapples, 
Pomegranates, Oranges, Bananas, Red Apples, Watermelons, Plums, Green 
Apples, Seeded Grapes, Cherries, Pears, Peaches, Blueberries, Strawberries, 
Seedless Grapes 


Note first that once we have proved this claim, we have proved the theorem,
since the linear extensions $L_1,\dots,L_w$ we get for the chains $C_1,\dots,C_w$
are a realizer of $P$, meaning $P = \bigcap_{i=1}^{w} L_i$. It is clear that the
right side is an extension of the left side. Now consider $x \parallel y$. Then
$x \in C_i$ for some $i \in [w]$, and $y \in C_j$ for some $j \ne i$. Then we have
$y <_{L_i} x$ and $x <_{L_j} y$, so neither relation occurs in $L_i \cap L_j$.

Think of $L_i$ as a linear extension of $P$ that puts the elements from $C_i$ as
late as possible. A sketch is given on the right of Figure 33.


Proof of claim. Denote the elements of $C$ by $x_1 < x_2 < \dots < x_k$ and
consider their upsets. Note that clearly
$U(x_1) \supseteq U(x_2) \supseteq \dots \supseteq U(x_k)$.
Now define $X_0 := X \setminus U[x_1]$, $X_j := U(x_j) \setminus U[x_{j+1}]$ (for
$1 \le j < k$) and $X_k := U(x_k)$, and let $P_j = (X_j, \le)$ be the poset
induced by $X_j$ ($0 \le j \le k$). Given a linear extension $L_j$ for each $P_j$
($0 \le j \le k$) we now define the linear extension $L$ of $P$ as:

\[ L : L_0 \; x_1 \; L_1 \; x_2 \; L_2 \; \dots \; x_k \; L_k. \]

First, convince yourself that $L$ really is a linear extension of $P$, i.e. every
element occurs exactly once and if $x <_P y$ then $x$ occurs before $y$.

Now assume $x \in C$ and $x \parallel y$. Say $x = x_i$. Then $y \notin U(x_i)$, so
$y$ is not part of any $X_j$ for $j \ge i$. This means $y <_L x_i = x$ as claimed.


Example 5.11 (Standard Examples). For a positive integer $n$ define the poset
$S_n = (\{a_1,\dots,a_n,b_1,\dots,b_n\}, \le)$ where
$a_i < b_j \Leftrightarrow i \ne j$ and there are no (non-reflexive) relations
within $\{a_1,\dots,a_n\}$ and $\{b_1,\dots,b_n\}$ respectively. For instance:

[Figure: Hasse diagrams of $S_1$, $S_2$, $S_3$ and $S_4$, with $b_1,\dots,b_n$ in
the top row and $a_1,\dots,a_n$ in the bottom row.]

We call these posets the standard examples. We claim that
$\dim(S_n) = n = w(S_n)$ (for $n \ge 2$), so the standard examples make the bound
from the last Theorem tight.

To see that $\dim(S_n) \ge n$, consider a realizer of $S_n$ (consisting of linear
extensions). Since $a_i \parallel b_i$ for $i \in [n]$, some linear extension $L$
in the realizer must have $b_i <_L a_i$. Since every remaining $b_j$ is greater
than $a_i$ they must all go right of $a_i$, and every remaining $a_j$ is smaller
than $b_i$, so they must all go left of $b_i$. This means that $b_j <_L a_j$ for
no $j \ne i$. To reverse all pairs $(a_i, b_i)_{i \in [n]}$ therefore requires $n$
linear extensions.
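For very small $n$ the dimension of $S_n$ can be confirmed by exhaustive search over families of linear extensions. The sketch below is a brute-force illustration with 0-based indices and hypothetical function names; it is only feasible for tiny posets.

```python
from itertools import combinations, permutations

def standard_example(n):
    """Return (elements, strict relation) of S_n: a_i < b_j iff i != j."""
    elements = [("a", i) for i in range(n)] + [("b", i) for i in range(n)]
    less = {(("a", i), ("b", j)) for i in range(n) for j in range(n) if i != j}
    return elements, less

def dimension(elements, less):
    """Smallest number of linear extensions whose intersection is the poset."""
    extensions = []
    for order in permutations(elements):
        position = {x: i for i, x in enumerate(order)}
        if all(position[x] < position[y] for x, y in less):
            # Store the strict relation of this linear extension as a set of pairs.
            extensions.append(frozenset(combinations(order, 2)))
    d = 1
    while True:
        for family in combinations(extensions, d):
            if frozenset.intersection(*family) == less:
                return d
        d += 1

print(dimension(*standard_example(2)))  # 2
print(dimension(*standard_example(3)))  # 3
```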


5.2 Capturing Posets between two Lines 




Figure 34: On the left we drew five shapes between two horizontal lines. From 
this we obtain a poset (shown on the right) by considering a shape less than 
another shape if it is left of it. 


Imagine $l_1$ and $l_2$ are two horizontal lines in the plane. An object
$O \subseteq \mathbb{R}^2$ is spanned between $l_1, l_2$ if it is contained in the
strip in between $l_1$ and $l_2$, is connected and touches both $l_1$ and $l_2$.

With a set of such objects we associate a poset where the strict relation is
given as "left of", meaning that $O_1 < O_2$ if and only if
$O_1 \cap O_2 = \emptyset$ and the leftmost point of $O_1$ is left of the leftmost
point of $O_2$. This clearly gives an order relation (once we add the reflexive
relations). Note that since the objects touch the top and bottom line we avoid
situations like this

[Figure: objects $O_1, O_2, O_3$ between $l_1$ and $l_2$, not all of which touch
both lines.]

where one might be inclined to say that $O_1$ is left of $O_3$ is left of $O_2$
but $O_1$ is not left of $O_2$. Since this is forbidden our "left of" is really
consistent with all reasonable intuitions one might have about leftness.

In the following we ask: What kind of posets can be represented in such a
way and what shapes are required?


Observation 5.12 (Straight Lines). If $P$ is 2-dimensional then $P$ can be
represented by straight segments spanned between the two lines.

To see this, assume $P = L_1 \cap L_2$ for two linear orders $L_1, L_2$. Then
place the elements of $P$ on $l_1$ in the order given by $L_1$, also place them
onto $l_2$ in the order given by $L_2$ and connect the points corresponding to the
same element. This gives a line segment $s_x$ for each $x \in X$.

Now observe:

\[ x <_P y \Leftrightarrow (x <_{L_1} y \text{ and } x <_{L_2} y) \Leftrightarrow s_x \text{ left of } s_y. \]

The reverse holds as well, i.e. any poset represented by straight segments
spanned between $l_1$ and $l_2$ is at most two-dimensional and a realizer is given
by sorting the segments according to their endpoints on $l_1$ and $l_2$.


The second type of object we consider are axis-aligned rectangles. Note that
since those rectangles must be tangential to $l_1$ and $l_2$, they are already
uniquely determined by their leftmost and rightmost x-coordinate. In fact, the
setting is not "really" two-dimensional and is easily seen to be equivalent to
the setting of interval orders where:

An interval order $P = (X, \le)$ is given by a set $X$ of open bounded intervals
in $\mathbb{R}$ with $(a,b) <_P (c,d)$ iff $b \le_{\mathbb{R}} c$.

We remark that not all interval orders are 2-dimensional but we will not 
prove this until later. We first characterize them in terms of a forbidden sub- 
poset: 


Theorem 5.13 (Fishburn). $P = (X, \le)$ is an interval order if and only if
$2 \oplus 2 \not\subseteq P$, i.e. there is no copy of the poset $2 \oplus 2$
contained in $P$.


Proof. "$\Rightarrow$" Assume there is a copy of $2 \oplus 2$ labeled like this.

[Figure: $2 \oplus 2$ with chains $a < b$ and $c < d$.]

If $P$ could be written as an interval order, then $a,b,c,d$ would be represented
by intervals $(a_l,a_r)$, $(b_l,b_r)$, $(c_l,c_r)$, $(d_l,d_r)$ with the following
properties:

• $a_r \le_{\mathbb{R}} b_l$ (since $a <_P b$)

• $b_l <_{\mathbb{R}} c_r$ (since $c \parallel_P b$)

• $c_r \le_{\mathbb{R}} d_l$ (since $c <_P d$)

• $d_l <_{\mathbb{R}} a_r$ (since $a \parallel_P d$)

This requires $a_r \le_{\mathbb{R}} b_l <_{\mathbb{R}} c_r \le_{\mathbb{R}} d_l <_{\mathbb{R}} a_r$,
which is clearly not possible.

"$\Leftarrow$" Let $P$ be $(2 \oplus 2)$-free. The crucial observation is that for
any two elements $b,d \in X$ we have $D(b) \subseteq D(d)$ or $D(b) \supseteq D(d)$.
Assume not: Then $D(b) \not\subseteq D(d)$ and $D(b) \not\supseteq D(d)$. This
implies $b \parallel d$ and the existence of $a \in D(b) \setminus D(d)$ and
$c \in D(d) \setminus D(b)$. Convince yourself that this means that $a,b,c,d$
forms a copy of $2 \oplus 2$ (check $a < b$, $c < d$, $a \parallel c$,
$a \parallel d$, $b \parallel c$), contradicting our assumption.

Given the observation, we can order the downsets
$\mathcal{D} = \{D(x) \mid x \in X\}$ by strict inclusion, i.e.
$\emptyset = D_0 \subset D_1 \subset D_2 \subset \dots \subset D_\mu$.

For $x \in X$ we choose the interval $(a_x, b_x)$ such that

\[ a_x := i \text{ where } D(x) = D_i, \qquad
   b_x := \begin{cases} \min\{j \mid x \in D_j\} & \text{if } x \notin \max(P) \\
                        \mu + 1 & \text{if } x \in \max(P). \end{cases} \]

Observe that $a_x < b_x$. Also, if $x <_P y$ then $x \in D(y)$ so $b_x \le a_y$,
and conversely, if $x \parallel y$ then $x \notin D(y)$ so $b_x > a_y$ and also
$y \notin D(x)$ so $b_y > a_x$. This means
$(a_x, b_x) \cap (a_y, b_y) \ne \emptyset$ and neither interval is left of the
other.


[Figure: a poset on the elements a, ..., h (left) and its chain of downsets
$D_0 = \emptyset = D(a) = D(b) = D(c)$, $D_1 = \{a,b\} = D(d)$,
$D_2 = \{a,b,c\} = D(e)$, $D_3 = \{a,b,c,d,e\} = D(f) = D(g)$,
$D_4 = \{a,b,c,d,e,g\} = D(h)$.]

Figure 35: Example for the construction in Fishburn's Theorem. The poset on
the left contains no $2 \oplus 2$. We determine all downsets
$D_0 \subset \dots \subset D_4$. We pick the intervals as the Theorem suggests,
for instance, the interval for $d$ is $(1,3)$ since $D_1$ is the downset of $d$
and $D_3$ is the first downset containing $d$.
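The construction from the proof can be sketched as code. The example data below are the downsets listed in Figure 35; the function name and the representation of the order as a set of pairs are assumptions made for this illustration.

```python
def interval_representation(elements, less):
    """Build open intervals (a_x, b_x) for a poset without a copy of 2 + 2.

    `less` is the strict order relation given as a set of pairs (x, y).
    """
    down = {x: frozenset(y for y in elements if (y, x) in less) for x in elements}
    # For a (2 + 2)-free poset the distinct downsets form a chain under
    # inclusion, so sorting them by size lists them as D_0, D_1, ..., D_mu.
    chain = sorted(set(down.values()), key=len)
    assert all(s < t for s, t in zip(chain, chain[1:])), "poset contains 2 + 2"
    mu = len(chain) - 1
    intervals = {}
    for x in elements:
        a = chain.index(down[x])
        containing = [j for j, d in enumerate(chain) if x in d]
        # Maximal elements lie in no downset and get the right endpoint mu + 1.
        b = min(containing) if containing else mu + 1
        intervals[x] = (a, b)
    return intervals

# Downsets D(x) as listed in Figure 35.
downsets = {"a": "", "b": "", "c": "", "d": "ab", "e": "abc",
            "f": "abcde", "g": "abcde", "h": "abcdeg"}
less = {(y, x) for x, below in downsets.items() for y in below}
intervals = interval_representation(list(downsets), less)
print(intervals["d"])  # (1, 3)

# The intervals reproduce the order: x < y iff b_x <= a_y.
print(all(((x, y) in less) == (intervals[x][1] <= intervals[y][0])
          for x in downsets for y in downsets))  # True
```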


We now consider the case where the objects are (open) triangles spanned
between $l_1$ and $l_2$ with the base on $l_1$ and tip on $l_2$ (see pictures below).


Theorem 5.14. P is a triangle order if and only if there is a linear extension 
of P that orders all copies of 2 @ 2. 


Here we say a $2 \oplus 2$ is ordered by a linear order $L$ if $L$ puts both
elements of one chain before both elements of the other chain. In other words,
if $2 \oplus 2$ is labeled like this:

[Figure: $2 \oplus 2$ with chains $a < b$ and $c < d$.]

then we want either $a <_L b <_L c <_L d$ or $c <_L d <_L a <_L b$.


Proof. "$\Rightarrow$" If $P$ is a triangle order then we claim that taking $L$ as
the order of the tips of the triangles from left to right works. Note that $L$ is
a total order (obviously) and a linear extension of $P$: If a triangle is left of
another triangle ($T_1 <_P T_2$), then in particular its tip is left of the tip
of the other triangle ($T_1 <_L T_2$).

Now consider a copy of $2 \oplus 2$ with the same labeling as above. Then
$a,b,c,d$ correspond to triangles $T_a, T_b, T_c, T_d$ and $T_a$ is left of $T_b$.

Assume for contradiction that the tip of $T_c$ is in between the tips of $T_a$
and $T_b$. Since $c \parallel b$ we need $T_c$ to intersect $T_b$, maybe like this:

[Figure: triangles $T_a$, $T_c$, $T_b$ from left to right, with $T_c$
intersecting $T_b$.]

Now there is no way to add $T_d$: It must be right of $T_c$ but not right of
$T_a$, which is clearly impossible.


"$\Leftarrow$" Assume $L : x_1 < x_2 < \dots < x_n$ is a linear extension of $P$
that orders all copies of $2 \oplus 2$. We argue that $P$ is a triangle order. We
construct a triangle $T_x$ for every $x \in P$, given by the horizontal position
of the three points, namely the tip $t_x$ and base $(a_x, b_x)$.

The tips should simply be ordered according to $L$, so $t_{x_i} = i$ for each
$i \in [n]$. We pick the bases of the triangles one after the other, going
through the elements of $P$ from left to right according to $L$.

For $x_1$ we define its base arbitrarily. Having defined triangles for
$x_1,\dots,x_k$, $k \ge 1$, we now show how to add $T_x$ for $x = x_{k+1}$ with
the desired relationships to the already existing triangles.

Consider the downset $D(x)$. We choose $a_x = \max\{b_y \mid y \in D(x)\}$, so as
far left as possible while still right of every triangle that $T_x$ should be
right of. The right end point $b_x$ is chosen somewhere far right.

[Figure: existing triangles for $y_1$, $u$, $y_2$, $y_3$ and the new triangle
$T_x$ with left base point $a_x$ and right base point $b_x$ far right.]

In the picture we assume $D(x) = \{y_1, y_2, y_3\}$, so
$a_x = \max\{b_{y_1}, b_{y_2}, b_{y_3}\}$. This ensures that $x >_P y$ implies that
$T_x$ is right of $T_y$ (for any $y$). Also (by choice of the tips) $T_x$ is not
left of an existing triangle. However, the picture shows that there might be
$u \notin D(x)$ such that $T_x$ is right of $T_u$. We need to fix this.

We do this by shifting $a_x$ to the left. But remember that $a_x$ needs to be
right of $b_y$ for each $y \in D(x)$, so as soon as $a_x$ would overtake such a
$b_y$, we just take it with us, shifting it to the left as well. Now we need to
argue that this does not cause $T_y$ to be suddenly left of some $T_u$ for
$y \not< u$. So assume $a_x$ and $b_y$ are about to overtake the left endpoint
$a_u$ of $T_u$, maybe like this:

[Figure: shifting $b_y$ and $a_x$ left, almost overtaking $a_u$.]

Now remember: The reason we are shifting is that there is some $t \not< x$
with $T_t$ still left of $T_x$. Since we almost overtook $a_u$ already, this
triangle $T_t$ is also left of $T_u$, so $t < u$ (otherwise we would have made a
mistake earlier). Combining what we know ($y < x$, $t < u$, $y \not< u$,
$t \not< x$) we see that $y, x, t, u$ forms a copy of $2 \oplus 2$ like this:

[Figure: $2 \oplus 2$ with chains $y < x$ and $t < u$.]

Since $x$ is the most recent element, it comes last in $L$, and since $L$ orders
every copy of $2 \oplus 2$ we have $t <_L u <_L y <_L x$ and the picture above was
actually misleading. In truth we are in this situation:

[Figure: the same triangles with the tip of $T_u$ left of the tip of $T_y$.]

Here, shifting $a_x$ and $b_y$ past $a_u$ is no problem as $T_u$ and $T_y$ will
still intersect. So we can shift as far as we need, which proves the claim.

[Figure: nested regions labeled, among others, "All Orders", "Convex Orders" and
"Triangle Orders".]

Figure 36: Venn diagram showing the relationships between some types of orders
we consider here and in the following.


As an overview over the results of this section consider the Venn diagram 
in Figure 36. The class of convex and triangle orders are already defined geo- 
metrically as orders arising from convex sets or triangles spanned between lines, 
respectively. But remember that interval orders and orders of dimension at most 
two also arise from shapes in that way: From axis aligned boxes and segments. 
In Theorem 5.15 we will furthermore see that the set of all orders arises from 
y-monotone curves. 

The containments are easy to see: Any straight segment is a (degenerate)
triangle and every triangle is convex. To see the inclusion of interval orders in
triangle orders, observe that if we only allow “boring” triangles that have the 
tip above their base, then triangles intersect iff their bases intersect, so these 
triangles behave like intervals. Alternatively, this relation immediately follows 
from Theorem 5.13 and Theorem 5.14. 

In Lemma 5.16 we show that there is indeed a non-convex three-dimensional
order. We remark without proof that the "small" circle of interval orders
already contains posets of arbitrary dimension: The interval order defined by all
closed intervals with end points in $[n]$ has dimension at least
$\log\log n + (\tfrac{1}{2} + o(1)) \log\log\log n$.


Theorem 5.15. Every poset $P$ can be represented by y-monotone curves spanned
between $l_1$ and $l_2$.

Proof. Take any realizer of $P$, i.e. $P = L_1 \cap L_2 \cap \dots \cap L_k$ for
linear orders $L_1,\dots,L_k$. Assume without loss of generality $k \ge 2$ and
introduce $k - 2$ lines in between $l_1$ and $l_2$. This gives $k$ lines in total.
On the $i$-th line we distribute the elements of $P$ in increasing order according
to $L_i$. We then connect all points belonging to the same element with straight
segments.

[Figure: five horizontal lines, from bottom to top ordered by $L_1$ (the line
$l_1$), $L_2$, $L_3$, $L_4$ and $L_5$ (the line $l_2$).]

Now it is obvious that $x <_{L_i} y$ for each $i$ if and only if the line for $x$
is left of the line for $y$.


Lemma 5.16. There is a 3-dimensional order that is not a convex order. 


Proof. Here is one such poset. It is a modified standard example:

[Figure: Hasse diagram of a poset on the elements $d_1, d_2, d_3$,
$c_1, c_2, c_3$, $b_1, b_2, b_3$ and $a_1, a_2, a_3$, drawn in four rows from
top to bottom.]

Assume it can be represented by convex shapes $C_{a_1}, C_{a_2}, \dots, C_{d_3}$
that are spanned between $l_1$ and $l_2$. Note that this implies that each such
shape $C_v$ contains a straight line $S_v$ that is spanned between $l_1$ and $l_2$.

Since $a_i \parallel d_i$ for $i = 1,2,3$ the shapes $C_{a_i}$ and $C_{d_i}$
intersect. So take a point $x_i \in C_{a_i} \cap C_{d_i}$. Assume without loss of
generality that $x_1$ has the middle y-coordinate. With respect to the straight
line from $x_2$ to $x_3$, we have that $x_1$ is either left of it (red area) or
right of it (blue area):

[Figure: the points $x_2$ and $x_3$ between $l_1$ and $l_2$, with the area left
of the line through them in red and the area right of it in blue.]

Assume $x_1$ is on the left. Then since $c_1 > a_2$ and $c_1 > a_3$, the shape
$C_{c_1}$ must be right of $x_2$ and $x_3$ (since $x_2 \in C_{a_2}$,
$x_3 \in C_{a_3}$), but since $c_1 < d_1$ it must also be left of $x_1$ (since
$x_1 \in C_{d_1}$). The same holds for the straight line $S_{c_1}$. Clearly such a
straight line does not exist.

If $x_1$ is on the right instead, we run into the same problem with the line $S_{b_1}$:
It has to be right of $x_1$ but left of $x_2$ and $x_3$.


5.3 Sets of Sets and Multisets — Lattices


We now consider posets whose elements are sets ordered by inclusion, meaning
$A \le B$ iff $A \subseteq B$ (we will prefer the latter notation). Usually the sets we consider
are subsets of $[n]$ and the most important example is the Boolean lattice $B_n$,
the family of all subsets of $[n]$. Take for instance $B_3$:


[Figure: Hasse diagram of $B_3$ with levels, from top to bottom:
$\{1,2,3\}$; $\{1,2\}, \{2,3\}, \{1,3\}$; $\{2\}, \{1\}, \{3\}$; $\emptyset$.]


Two sets are incomparable if neither is included in the other. For
instance $\{1\} \parallel \{2,3\}$ and $\{1,2\} \parallel \{1,3\}$. Note also that $\{\{1,2\}, \{2,3\}, \{1,3\}\}$ is
a largest antichain in $B_3$.

The next Theorem generalizes this observation. 


Theorem 5.17 (Sperner’s Theorem). $w(B_n) = \binom{n}{\lfloor n/2 \rfloor}$.


Proof. Note that two different sets of the same size are never included in one
another. So taking all $k$-subsets of $[n]$ (i.e. all subsets of size $k$) yields an
antichain of size $\binom{n}{k}$. Choosing $k = \lfloor n/2 \rfloor$ maximizes its size and proves the lower
bound.

To prove the upper bound, we introduce a new notion: For a permutation $\pi$
of $[n]$ we say that $\pi$ meets a set $A \subseteq [n]$ if the elements of $A$ form a prefix of $\pi$,
meaning:

\[ A = \{\pi(1), \pi(2), \ldots, \pi(|A|)\}. \]


For instance, the permutation $\pi = 24513$ meets $\{2,4\}$ and $\{1,2,4,5\}$ but does
not meet $\{4,5\}$.

Now consider an antichain $\mathcal{F}$. We count the number of permutations that
meet an element of $\mathcal{F}$. Note that the sets met by a single permutation $\pi$ form
a chain, so no $\pi$ can meet several elements of $\mathcal{F}$.

So clearly the sets $\{\pi \mid \pi \text{ meets } A\}$ are disjoint for different $A \in \mathcal{F}$ and we
have:

\[ \sum_{A \in \mathcal{F}} |\{\pi \mid \pi \text{ meets } A\}| \le n!. \]


For any given $A \subseteq [n]$ there are $|A|!\,(n - |A|)!$ permutations that meet $A$: We
know that the first $|A|$ elements of such a $\pi$ are given by $A$ and can be arranged
in $|A|!$ ways, and the remaining $n - |A|$ elements can be arranged in $(n - |A|)!$
ways. So we have:

\[ \sum_{A \in \mathcal{F}} |A|!\,(n-|A|)! \le n!
   \;\Rightarrow\; \sum_{A \in \mathcal{F}} \frac{1}{\binom{n}{|A|}} \le 1
   \;\Rightarrow\; \sum_{A \in \mathcal{F}} \frac{1}{\binom{n}{\lfloor n/2 \rfloor}} \le 1
   \;\Rightarrow\; |\mathcal{F}| \le \binom{n}{\lfloor n/2 \rfloor}. \]


In the first step we divided both sides by $n!$, then we used that $k = \lfloor n/2 \rfloor$
maximizes $\binom{n}{k}$ and thus made the term under the sum independent of $A$.
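
The counting step and the statement of the theorem can be checked by brute force
for small $n$. The following Python sketch is only an illustration; the function
names `meets`, `count_meeting` and `max_antichain_size` are our own and do not
appear in the text. It counts the permutations meeting a set and searches all
antichains of $B_n$ by backtracking.

```python
from itertools import combinations, permutations
from math import comb, factorial


def meets(perm, a):
    """True if the elements of a form a prefix of the permutation perm."""
    return set(perm[:len(a)]) == set(a)


def count_meeting(n, a):
    """Number of permutations of [n] that meet the set a (brute force)."""
    return sum(meets(p, a) for p in permutations(range(1, n + 1)))


def max_antichain_size(n):
    """Size of a largest antichain in B_n, found by exhaustive backtracking."""
    subsets = [frozenset(c) for r in range(n + 1)
               for c in combinations(range(1, n + 1), r)]
    best = 0

    def extend(start, family):
        nonlocal best
        best = max(best, len(family))
        for idx in range(start, len(subsets)):
            s = subsets[idx]
            # s may be added only if it is incomparable to all chosen sets
            if all(not (s <= b or b <= s) for b in family):
                family.append(s)
                extend(idx + 1, family)
                family.pop()

    extend(0, [])
    return best


if __name__ == "__main__":
    print(count_meeting(5, {2, 4}), factorial(2) * factorial(3))
    print([max_antichain_size(n) for n in range(1, 6)])
```

The search is exponential and only meant for $n \le 5$ or so.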


Remark. Note that if $\mathcal{F}$ actually is a largest antichain in $B_n$, then all inequalities
used in the above proof must be tight. In particular, each $\binom{n}{|A|}$ must actually
have been equal to $\binom{n}{\lfloor n/2 \rfloor} = \binom{n}{\lceil n/2 \rceil}$. So only sets of size $\lfloor n/2 \rfloor$ or $\lceil n/2 \rceil$ are used in
$\mathcal{F}$. From this it is easy to see that:


• If $n$ is even, then $\mathcal{F} = \binom{[n]}{n/2}$ is the unique largest antichain.


• If $n$ is odd, then all largest antichains are contained in $\binom{[n]}{\lfloor n/2 \rfloor} \cup \binom{[n]}{\lceil n/2 \rceil}$.

Next we consider so-called intersecting families. We say $\mathcal{F} \subseteq 2^{[n]}$ is intersecting
if $\forall A, B \in \mathcal{F}: A \cap B \ne \emptyset$.


Observation 5.18. If $\mathcal{F} \subseteq 2^{[n]}$ is intersecting, then $|\mathcal{F}| \le 2^{n-1}$ and this bound
can be attained.


Proof. For any element $A \in \mathcal{F}$, the complement $[n] \setminus A$ is disjoint from $A$ and
cannot be in $\mathcal{F}$. So taking complements is an injection that maps elements from
$\mathcal{F}$ to elements not in $\mathcal{F}$. Therefore $|\mathcal{F}| \le 2^n - |\mathcal{F}|$ and thus $|\mathcal{F}| \le 2^{n-1}$.

To attain this bound, consider $\mathcal{F}_x = \{A \subseteq [n] \mid x \in A\}$ for some fixed
$x \in [n]$. Clearly this is an intersecting family of size $2^{n-1}$.


So this problem has a fairly straightforward and boring solution. But what 
about the maximum cardinality of an intersecting k-family, i.e. we only allow 
sets of size k? If k = 1, then we can clearly only pick one set (any two different 
sets of size 1 do not intersect). If k = 2, then it is easy to see that n — 1 is best 
possible using the sets {1,2}, {1,3}, {1,4},...,{1,n}. Is the best strategy still 
the “obvious” one in general? It turns out the answer is “yes”, but the proof is 
a bit more involved. 
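
For small parameters the question can be explored by exhaustive search. The
sketch below is our own illustration (the names `star_family` and
`max_intersecting` are hypothetical): it compares the size of the "obvious"
family of all $k$-sets through the element 1, which has $\binom{n-1}{k-1}$
members, with the true maximum found by backtracking over all intersecting
$k$-families.

```python
from itertools import combinations
from math import comb


def star_family(n, k):
    """All k-subsets of [n] that contain the element 1."""
    return [frozenset(c) for c in combinations(range(1, n + 1), k) if 1 in c]


def max_intersecting(n, k):
    """Maximum size of an intersecting k-family of [n] (exhaustive search)."""
    sets = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    best = 0

    def extend(start, family):
        nonlocal best
        best = max(best, len(family))
        for idx in range(start, len(sets)):
            # a new set must intersect every set chosen so far
            if all(sets[idx] & b for b in family):
                family.append(sets[idx])
                extend(idx + 1, family)
                family.pop()

    extend(0, [])
    return best


if __name__ == "__main__":
    for n, k in [(4, 2), (5, 2), (6, 2), (6, 3)]:
        print(n, k, len(star_family(n, k)), max_intersecting(n, k))
```

The search enumerates every intersecting family, so it is only feasible for
very small $n$ and $k$.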

Theorem 5.19 (Erdős-Ko-Rado). For two integers $n$ and $0 < k \le n/2$ the
maximum size of an intersecting $k$-family of $[n]$ is $\binom{n-1}{k-1}$.

Note that we have the requirement of $2k \le n$ since the problem becomes
trivial otherwise: Two sets of size bigger than $n/2$ always intersect.

Proof. The lower bound is easy, just take $\mathcal{F}^* = \{A \subseteq [n] \mid x \in A, |A| = k\}$, for
some fixed $x \in [n]$. Clearly, $\mathcal{F}^*$ is an intersecting family of the desired size.

For the upper bound we use a similar approach as in Sperner’s Theorem.
Recall that a circular permutation $\sigma$ of $[n]$ is an arrangement of the numbers
$1, \ldots, n$ on a circle where we do not distinguish between circular shifts, e.g.
$53142 = 31425$. The number of circular permutations of $[n]$ is $(n-1)!$.

We say a circular permutation $\sigma$ meets a set $A \subseteq [n]$ if the elements of $A$
appear consecutively on $\sigma$. For instance $A = \{1,2,4\}$ is met by $\sigma = 25341$ but
$B = \{2,3,4\}$ is not.

Now fix F to be an intersecting k-family. 


Claim. For any circular permutation $\sigma$, the number of sets $A \in \mathcal{F}$ met by $\sigma$ is
at most $k$.


Proof by Picture. Fix one $k$-set $A_0$ met by $\sigma$, we draw it in red (here $k = 5$,
$n = 12$):


Consider a different $k$-set $B \in \mathcal{F}$ that is also met by $\sigma$ (here drawn in blue),
it must intersect $A_0$. There are $2(k-1)$ choices for such a $B$, $(k-1)$ for
intersecting $A_0$ at the “clockwise” side and $(k-1)$ for intersecting $A_0$ at the
“counterclockwise” side. But note that if $B \in \mathcal{F}$ intersects some part of $A_0$
until some gap between two consecutive elements on $\sigma$, then the set $B'$ that
intersects the opposite part of $A_0$ starting from that gap cannot be in $\mathcal{F}$ (shown
dashed in the picture). This is because $B' \cap B = \emptyset$. Here we use that $2k \le n$
so $B$ and $B'$ do not intersect on the opposite side either. This shows that there
are at most $2(k-1)/2 = k-1$ sets met by $\sigma$ other than $A_0$, so at most $k$ in
total. (end claim)


Just like in the proof of Sperner’s Theorem, the number of circular permutations
meeting $A \in \mathcal{F}$ is $|A|!\,(n - |A|)! = k!\,(n-k)!$. We now double count the
set

\[ S := \{(A, \sigma) \mid A \in \mathcal{F} \text{ and } \sigma \text{ is a circular permutation meeting } A\}. \]


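First Way:
\[ |S| = \sum_{\sigma} |\{A \in \mathcal{F} \mid \sigma \text{ meets } A\}| \le \sum_{\sigma} k = (n-1)! \cdot k. \]


Second Way:
\[ |S| = \sum_{A \in \mathcal{F}} |\{\sigma \mid \sigma \text{ meets } A\}| = \sum_{A \in \mathcal{F}} k!\,(n-k)! = |\mathcal{F}| \cdot k!\,(n-k)!. \]


Putting this together we obtain:

\[ |\mathcal{F}| \le \frac{(n-1)!\,k}{k!\,(n-k)!} = \frac{(n-1)!}{(k-1)!\,(n-k)!} = \binom{n-1}{k-1}. \]


5.3.1 Symmetric Chain Partition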


We have already seen in Sperner’s Theorem that the width of the Boolean lattice
$B_n$ is $\binom{n}{\lfloor n/2 \rfloor}$. Using Dilworth’s Theorem, this means $B_n$ can be partitioned into
$\binom{n}{\lfloor n/2 \rfloor}$ chains. We are going to prove a version of this that is stronger in two ways.
Firstly, we are going to look at a generalization of Boolean lattices. Secondly, we
find a particularly interesting partition into chains. The next definition captures
what we mean by “interesting”.


Definition 5.20. 


• Let $P = (X, \le)$ be a poset. Then the rank of $x \in X$ is the size of a largest
chain beneath $x$, i.e.

\[ \mathrm{rank}(x) = \max\{|C| \mid C \text{ is a chain with } \max(C) = x\}, \quad x \in X. \]

Consider for instance the following poset in which we annotated each
element with its rank:

[Figure: a poset of rank 4 in which each element is annotated with its rank.]


• Define $\mathrm{rank}(P)$ to be the maximum rank of an element of $P$ (this is just
a new name for the height of the poset). The above poset has rank 4.


• A poset $P$ is ranked if all maximal chains ending in an element $x$ have the
same size. The poset shown above is not ranked since there are maximal
chains of size 2, 3 and 4 all ending in the same element. Another way to
think about it is that a poset is ranked iff for any cover relation $x < y$ the
elements $x$ and $y$ have adjacent ranks.


• A chain $C$ is unrefinable if there is no $z \in X \setminus C$ with $\min(C) < z <
\max(C)$ such that $C \cup \{z\}$ is a chain. Note that if $P$ is ranked, then $C$ is
unrefinable if and only if it does not skip any rank.


• A chain $C$ is symmetric if it is unrefinable and
$\mathrm{rank}(\min(C)) + \mathrm{rank}(\max(C)) = \mathrm{rank}(P) + 1$.
In Boolean lattices this is a very intuitive concept.


• A symmetric chain decomposition is a partition of the elements of $P$ into
symmetric chains. Figure 37 shows decompositions of some Boolean lattices.


We define $B(m_1, m_2, \ldots, m_k)$ to be the poset on all submultisets of $M =
\{m_1 \cdot 1, m_2 \cdot 2, \ldots, m_k \cdot k\}$ ordered by inclusion. Here, $m_i$ is the repetition number
of the element $i$. The elements $1, 2, \ldots, k$ do not really need to be numbers, we
might as well have chosen & instead of 1 and © instead of 2. The elements
just need to be distinguishable and numbers happen to be convenient. Recall
that if $A, B$ are multisets with types $1, \ldots, k$ and repetition numbers $a_1, \ldots, a_k$
for $A$ and $b_1, \ldots, b_k$ for $B$ then by $A \subseteq B$ we mean $a_i \le b_i$ for all $i$.
Note that $B(1, \ldots, 1) = B_k$ (with $k$ ones) since in that case $M = \{1, \ldots, k\}$.

Figure 38 depicts $B(1,2,1)$, a three dimensional poset that is similar to $B_3$
except it has an additional “plane” since we can have the element 2 not only
zero or one times, but also twice.

There are many other ways to explain what these posets are, maybe you 
prefer one of the following perspectives: 


• The poset $B(m_1, \ldots, m_k)$ is isomorphic to the poset of all divisors of
$p_1^{m_1} \cdot p_2^{m_2} \cdot \ldots \cdot p_k^{m_k}$ ordered by divisibility, where $p_1, \ldots, p_k$ are distinct
primes. This is illustrated for $B(1,2,1)$ and $p_1 = 2$, $p_2 = 3$, $p_3 = 5$ on the
right of Figure 38.


• $B(m_1, \ldots, m_k)$ is the product of chains:

\[ B(m_1, \ldots, m_k) = C_1 \times \ldots \times C_k \]

where each $C_i$ is a chain on $m_i + 1$ elements. By the product $P \times Q$ of two
posets $P$ and $Q$ we mean the poset with elements $\{(x,y) \mid x \in P, y \in Q\}$
and dominance order, so

\[ (x,y) \le_{P \times Q} (x',y') \;\Leftrightarrow\; x \le_P x' \text{ and } y \le_Q y'. \]


• Using particularly easy chains we can view $B(m_1, \ldots, m_k)$ as the set
$[m_1 + 1] \times [m_2 + 1] \times \ldots \times [m_k + 1]$ with dominance order.
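
The first perspective is easy to test on the example $B(1,2,1)$ and the number
90. The following sketch is our own illustration, with hypothetical helper
names: a multiset is stored as its tuple of repetition numbers, and we check
that multiset inclusion corresponds exactly to divisibility.

```python
from itertools import product


def multisets(ms):
    """All elements of B(m_1,...,m_k) as tuples of repetition numbers."""
    return list(product(*(range(m + 1) for m in ms)))


def included(a, b):
    """Multiset inclusion: a_i <= b_i for all i (dominance order)."""
    return all(x <= y for x, y in zip(a, b))


def to_divisor(a, primes):
    """Map repetition numbers (r_1,...,r_k) to p_1^r_1 * ... * p_k^r_k."""
    d = 1
    for exponent, p in zip(a, primes):
        d *= p ** exponent
    return d


if __name__ == "__main__":
    elements = multisets((1, 2, 1))
    print(sorted(to_divisor(a, (2, 3, 5)) for a in elements))
```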


Note that $B(m_1, \ldots, m_k)$ is ranked. A set $A = \{r_1 \cdot 1, \ldots, r_k \cdot k\}$ has rank
$\mathrm{rank}(A) = r_1 + \ldots + r_k + 1$ and the rank of the entire poset is
$\mathrm{rank}(B(m_1, \ldots, m_k)) = m_1 + \ldots + m_k + 1$.


Theorem 5.21. $B(m_1, \ldots, m_k)$ has a symmetric chain decomposition.


Proof. We do induction on $k$. If $k = 1$, then $B(m_1)$ is a (symmetric) chain of
length $m_1 + 1$. So the trivial partition works.

For $k \ge 2$, consider the subposet $P = B(m_1, \ldots, m_{k-1}, 0)$ of $B(m_1, \ldots, m_k)$,
consisting of those multisets with repetition number 0 for type $k$. Clearly $P$ is
isomorphic to $B(m_1, \ldots, m_{k-1})$ and has, by induction hypothesis, a symmetric
chain decomposition.


Figure 37: Symmetric chain decompositions of $B_1$, $B_2$, $B_3$ and $B_4$.


Figure 38: Left: The Hasse Diagram of the poset $B(1,2,1)$.
Right: Divisors of $90 = 2^1 \cdot 3^2 \cdot 5^1$ ordered by divisibility.


Now the idea is really simple, see Figure 39. Given a symmetric chain
decomposition of $P$, we first partition $B(m_1, \ldots, m_k)$ into “curtains” that run
along the chains in $P$ and then we partition the curtains (which are essentially
two dimensional grids) into symmetric chains. Now for the formal argument.
If $P = C_1 \cup \ldots \cup C_R$ is a partition into symmetric chains, then $B(m_1, \ldots, m_k) =
\mathrm{Cur}_1 \cup \ldots \cup \mathrm{Cur}_R$ is a partition into $R$ sets where

\[ \mathrm{Cur}_i = \{A \cup \{j \cdot k\} \mid A \in C_i,\ 0 \le j \le m_k\}. \]

Now focus on one of the chains $C = C_i$ with elements $A_1 \subset A_2 \subset \ldots \subset A_l$. The
corresponding curtain forms a grid of size $(m_k + 1) \times l$ standing on its tip as shown
in Figure 40. Note that it is not necessarily square. The grid can easily be
partitioned into unrefinable chains that look like hooks as shown. If $m_k + 1 = l$
the last hook will be a chain of size 1, otherwise the last hook will look straight
in the picture, either pointing to the top left if $l > m_k + 1$ or to the top right if
$l < m_k + 1$. We only need to check that all these “hook-chains” are symmetric.
Consider the first hook

\[ A_1 \subset A_2 \subset \ldots \subset A_l \subset A_l \cup \{k\} \subset \ldots \subset A_l \cup \{m_k \cdot k\}. \]

Its minimum has rank $|A_1| + 1$ and its maximum has rank $|A_l| + m_k + 1$.
Remember that $C$ was symmetric in $P$ so we had $|A_1| + 1 + |A_l| + 1 = \mathrm{rank}(P) + 1$
and get:

\[ |A_1| + 1 + |A_l| + m_k + 1 = \mathrm{rank}(P) + 1 + m_k = \mathrm{rank}(B(m_1, \ldots, m_k)) + 1. \]

This proves that the first hook is a symmetric chain. Subsequent hooks have
their minima at higher ranks but the ranks of the maxima are correspondingly
lower so it is easy to see that they are symmetric as well.


Figure 37 shows the results of this construction for the Boolean lattices $B_1$,
$B_2$, $B_3$ and $B_4$. You may want to verify your understanding by constructing
these symmetric chain partitions yourself. Note that in the case of Boolean
lattices all curtains have height 2 and will therefore be partitioned into only one
or two hooks.


Figure 39: On the left: Partition of $B(2,3)$ into three symmetric chains. On
the right: The corresponding partition of $B(2,3,5)$ into three “curtains”.
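
The inductive construction from the proof of Theorem 5.21 can be sketched in a
few lines of Python. This is a minimal sketch under our own conventions, not
part of the original notes: a multiset is represented by its tuple of
repetition numbers $(r_1, \ldots, r_k)$, a chain is a list of such tuples, and the
names `symmetric_chains` and `all_multisets` are hypothetical. Hook number $t$ of
a curtain runs along the chain at repetition number $t$ for type $k$ and then
turns at $A_{l-t}$.

```python
from itertools import product


def all_multisets(ms):
    """All elements of B(m_1,...,m_k) as tuples of repetition numbers."""
    return list(product(*(range(m + 1) for m in ms)))


def symmetric_chains(ms):
    """Symmetric chain decomposition of B(m_1,...,m_k) via curtains and hooks."""
    # Base case k = 1: B(m_1) is a single chain.
    chains = [[(j,) for j in range(ms[0] + 1)]]
    for m in ms[1:]:
        new_chains = []
        for chain in chains:  # each chain A_1 < ... < A_l spans one curtain
            l = len(chain)
            for t in range(min(l - 1, m) + 1):
                # horizontal part: A_1, ..., A_{l-t}, each with t copies of the new type
                hook = [a + (t,) for a in chain[:l - t]]
                # vertical part: A_{l-t} with t+1, ..., m copies of the new type
                hook += [chain[l - t - 1] + (j,) for j in range(t + 1, m + 1)]
                new_chains.append(hook)
        chains = new_chains
    return chains


if __name__ == "__main__":
    for chain in symmetric_chains((1, 2, 1)):
        print(chain)
```

Since the rank of a tuple is the sum of its entries plus one, a chain is
symmetric exactly if the entry sums of its minimum and maximum add up to
$m_1 + \ldots + m_k$.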


Since $B_n$ grows exponentially in $n$, it may be impractical to compute the
symmetric chain partition of $B_n$ explicitly for large $n$. However, there is a simple
implicit characterization that allows us to compute for any set $A \subseteq [n]$ what the
other sets in the symmetric chain containing $A$ are. We state it here without
proof.

The first step is to associate with $A \subseteq [n]$ the characteristic zero-one string
$c_A$ that encodes for each $i \in \{1, \ldots, n\}$ in increasing order whether or not $i$ is
in $A$. For $A = \{1,2,7,8,10,11\} \subseteq [12]$ this would be $c_A = 110000110110$.

Now, roughly speaking, think of the 0s as opening parentheses and 1s as
closing parentheses, then $c_A$ is a (not necessarily well-formed) parenthesis
expression where some matching pairs can be found. In our example this would
be:

    c_A:       1  1  0  0  0  0  1  1  0  1  1  0
    position:  1  2  3  4  5  6  7  8  9 10 11 12

with the matching pairs (position of the 0, position of the 1) being
(6,7), (5,8), (9,10) and (4,11).
The matched positions are those elements that do not vary within the chain.
The matched 1s form the minimum $m$ of the chain, here $m = \{7,8,10,11\}$, the
matched 0s are the elements missing from the maximum of the chain, so here
$M = [12] \setminus \{4,5,6,9\} = \{1,2,3,7,8,10,11,12\}$. The unmatched positions, here
$\{1,2,3,12\}$, will vary within the chain and are added from left to right. So in
the symmetric chain partition of $B_n$ the set $A$ will be part of the symmetric
chain:

\[ \{7,8,10,11\} \subset \{1,7,8,10,11\} \subset \{1,2,7,8,10,11\}
   \subset \{1,2,3,7,8,10,11\} \subset \{1,2,3,7,8,10,11,12\}. \]


Figure 40: The elements of a curtain can be partitioned into symmetric chains
as shown.
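
The parenthesis-matching rule for $B_n$ described above translates directly into
a stack-based procedure. The following Python sketch is our own illustration
(the function name `symmetric_chain` is hypothetical); it returns the chain
containing a given set $A \subseteq [n]$, from minimum to maximum.

```python
from itertools import combinations


def symmetric_chain(a, n):
    """Symmetric chain of B_n containing the set a, via parenthesis matching."""
    open_positions = []   # positions of 0s (opening parentheses) not yet matched
    matched = set()
    for i in range(1, n + 1):
        if i in a:        # a 1 is a closing parenthesis
            if open_positions:
                matched.add(open_positions.pop())
                matched.add(i)
        else:             # a 0 is an opening parenthesis
            open_positions.append(i)
    # matched 1s form the minimum of the chain
    chain = [frozenset(i for i in matched if i in a)]
    # unmatched positions are added from left to right
    for i in range(1, n + 1):
        if i not in matched:
            chain.append(chain[-1] | {i})
    return chain


if __name__ == "__main__":
    for s in symmetric_chain({1, 2, 7, 8, 10, 11}, 12):
        print(sorted(s))
```

Calling the function on every subset of $[n]$ and collecting the distinct chains
yields the whole partition, which is only sensible for small $n$.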


5.4 General Lattices 


A poset is a lattice if any two elements $x, y$ have a least upper bound, i.e.

\[ \forall x, y \in P\ \exists z: \bigl(z \ge x \text{ and } z \ge y \text{ and } (\forall z': (z' \ge x \text{ and } z' \ge y) \Rightarrow z \le z')\bigr). \]

We write $z = x \vee y$ and call $z$ the join of $x$ and $y$ (it is necessarily unique). We
also require that any two elements have a largest lower bound, i.e.

\[ \forall x, y \in P\ \exists z: \bigl(z \le x \text{ and } z \le y \text{ and } (\forall z': (z' \le x \text{ and } z' \le y) \Rightarrow z \ge z')\bigr). \]

We write $z = x \wedge y$ and call $z$ the meet of $x$ and $y$ (it is necessarily unique).

Note that this implies that finite lattices have a unique minimum which we call 0 and
a unique maximum which we call 1.

The following poset $P$ is not a lattice although it has a maximum and minimum.
Note that $a$ and $b$ have no join, the upper bounds of $a$ and $b$ are $\{c, d, 1\}$
but there is no least upper bound (since $c$ and $d$ are incomparable). For similar
reasons, $c$ and $d$ have no meet. If we modify $P$ by adding an element $x$ as shown
we obtain a lattice $L$. In it, we have for instance $a \vee b = x$, $c \wedge d = x$.


Note that for two elements $x, y$ of a lattice with $x \le y$ we always have $x \wedge y = x$
and $x \vee y = y$. This also implies that 1 is the neutral element of the $\wedge$-operation
and 0 is the neutral element of the $\vee$-operation. The Boolean lattices $B_n$ really
are lattices where $\wedge = \cap$ and $\vee = \cup$, since $A \cup B$ really is the least set containing
$A$ and $B$ and $A \cap B$ is the largest set contained in $A$ and $B$.

We now consider Young lattices $Y(m,n)$. Its elements are Ferrers diagrams
with at most $m$ rows and at most $n$ columns ordered by inclusion. Figure 41
shows $Y(2,3)$.


Figure 41: The Young lattice $Y(2,3)$.


Young lattices are ranked. Diagrams with $s$ squares have rank $s + 1$.


6 Designs


Assume you are the leader of a local brewery and have just invented seven 
kinds of new beers. You are pretty sure all of them are awesome but are curious 
whether other people share your opinion. There are some experts, each of which 
can judge 3 beers (beyond that point they become tipsy and you do not trust 
their judgment). 

You want to make sure that each beer is evaluated in contrast to each other 
beer, i.e. for each pair of beers there should be one expert that tries both beers. 
Actually, make sure that there is exactly one expert for each pair, since experts 
are expensive and you don’t have any money to waste. 

So can you assign beers to the experts and meet this requirement? It turns 
out your problem has a solution with seven experts as shown in Figure 42. 


Figure 42: The beers $b_1, \ldots, b_7$ are represented by points. Each set of beers that
one expert tries is represented by a straight line or circle. Note that any pair of
points uniquely determines a line or circle containing that pair.


In terms of the following definition we have just found a 2-$(7,3,1)$-design:
$v = 7$ beers, $k = 3$ beers per expert, and each pair ($t = 2$) of beers tasted by
$\lambda = 1$ of the experts. This particular design is known as the Fano plane.


Definition 6.1. For numbers $t, v, k, \lambda \in \mathbb{N}$, a $t$-$(v,k,\lambda)$ design with point set $V$
is a multiset $\mathcal{B}$ of sets (“blocks”) of points with


(i) $|V| = v$,
(ii) $|B| = k$ for each $B \in \mathcal{B}$,
(iii) each set of $t$ points is a subset of exactly $\lambda$ blocks.


A $t$-$(v,k,\lambda)$ design is also denoted by $S_\lambda(t,k,v)$ and sometimes called
balanced block design.
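
The three conditions of the definition are easy to check mechanically. The
sketch below is our own illustration; the function name `is_design` and the
particular labelling `FANO` of the Fano plane (the translates of $\{1,2,4\}$
modulo 7) are assumptions and need not agree with the labelling in Figure 42.

```python
from collections import Counter
from itertools import combinations


def is_design(points, blocks, t, k, lam):
    """Check whether blocks (a list, so repetitions are allowed) form a
    t-(v,k,lam) design on the given point set."""
    points = set(points)
    if any(len(set(b)) != k or not set(b) <= points for b in blocks):
        return False
    cover = Counter()
    for b in blocks:
        for subset in combinations(sorted(b), t):
            cover[subset] += 1
    # every t-subset of the points must lie in exactly lam blocks
    return all(cover[subset] == lam for subset in combinations(sorted(points), t))


# One standard labelling of the Fano plane (an assumption, see above).
FANO = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}, {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]

if __name__ == "__main__":
    print(is_design(range(1, 8), FANO, t=2, k=3, lam=1))
```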


Example 6.2. 


• Designs with $k = v$, i.e. where each block contains all points, are called
trivial. This gives a $t$-$(v,v,|\mathcal{B}|)$-design for any $t \in [v]$. From now on we
assume $k < v$.


• Taking all point sets of size $k$, i.e. $\mathcal{B} = \binom{V}{k}$, yields a $t$-$(v, k, \binom{v-t}{k-t})$-design
for any $t \in [k]$ (verify this!).


• Consider the point set $V = \mathbb{F}_2^4 \setminus \{0\}$, so all bit-strings of length four
with addition modulo 2, except for $0 = (0,0,0,0)$. We have for instance
$(1,1,0,0) + (1,0,1,0) = (0,1,1,0)$. Now consider the blocks $\mathcal{B} =
\{\{x,y,z\} \mid x,y,z \in V,\ x + y + z = 0\}$.
Note that any pair of points $x \ne y$ uniquely determines a third point $z$
such that their sum is zero, namely $z = x + y$ (note that $x = -x$ in $\mathbb{F}_2^4$ so
$x \ne -y$ and $z = x + y \ne 0$). So $\mathcal{B}$ is a 2-$(v = 15, k = 3, \lambda = 1)$-design.
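
The last example is convenient to reproduce on a computer, because addition in
$\mathbb{F}_2^4$ is the bitwise XOR of 4-bit integers. The following sketch is our own
illustration and encodes the 15 points as the integers 1 to 15, which is an
assumption about representation only.

```python
from itertools import combinations

# Nonzero bit-strings of length four, encoded as the integers 1..15.
points = range(1, 16)

# The block through x != y is {x, y, x + y}; addition modulo 2 is XOR.
blocks = {frozenset((x, y, x ^ y)) for x, y in combinations(points, 2)}

# For every pair of points, count the blocks containing it.
pair_count = {
    pair: sum(1 for b in blocks if set(pair) <= b)
    for pair in combinations(points, 2)
}

if __name__ == "__main__":
    print(len(blocks), set(pair_count.values()))
```

Each block is generated three times (once per pair it contains), which is why
the blocks are collected in a set.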


6.1 (Non-)Existence of Designs 


As a first step we want to understand under which conditions on $\lambda, t, k$ and $v$
designs with those parameters may exist. In other words we are looking for
necessary conditions on the parameters $\lambda, t, k$ and $v$.


Theorem 6.3. If $\mathcal{B}$ is a $t$-$(v,k,\lambda)$-design then

(i) the number of blocks in $\mathcal{B}$ is $|\mathcal{B}| = \lambda \cdot \binom{v}{t} / \binom{k}{t}$,

(ii) if $I \subseteq V$ is a set of size $i \le t$, then the number of blocks containing $I$ is
$r_i = \lambda \binom{v-i}{t-i} / \binom{k-i}{t-i}$.
Note that for $i = t$ we get $r_t = \lambda$ as it should be. Also note that the
fact that every point is contained in $r_1$ blocks means that $\mathcal{B}$ is an $r_1$-regular
hypergraph (you can safely ignore this remark if you don’t know what
hypergraphs are).


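Proof. (i) We double count the set

\[ S = \{(T,B) \mid B \in \mathcal{B},\ T \subseteq B,\ |T| = t\}. \]

Firstly, each set $T \subseteq V$ of size $t$ (and there are $\binom{v}{t}$ of those) is contained
in exactly $\lambda$ blocks since that is what a design requires. So $|S| = \binom{v}{t} \cdot \lambda$.

Secondly, each block contains $\binom{k}{t}$ subsets of size $t$, so $|S| = |\mathcal{B}| \cdot \binom{k}{t}$.

Together we get: $|\mathcal{B}| \cdot \binom{k}{t} = \binom{v}{t} \lambda$ from which the claim follows.


(ii) Fix an $i$-set $I$ and double count the set

\[ S = \{(T,B) \mid B \in \mathcal{B},\ I \subseteq T \subseteq B,\ |T| = t\}. \]

Firstly, there are $\binom{v-i}{t-i}$ sets $T$ of size $t$ containing $I$ and for each such $T$
there are $\lambda$ blocks $B$ containing $T$. This gives $|S| = \binom{v-i}{t-i} \cdot \lambda$.

Secondly, for each of the $r_I$ blocks $B$ with $I \subseteq B$ there are $\binom{k-i}{t-i}$ sets $T$
of size $t$ with $I \subseteq T \subseteq B$. This gives $|S| = r_I \cdot \binom{k-i}{t-i}$.

Together we obtain $\binom{v-i}{t-i} \cdot \lambda = r_I \cdot \binom{k-i}{t-i}$. From this we see that $r_I$ actually
only depends on $|I| = i$ and the claim follows.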


Remark. We consider $t = 2$ to be the default case. Sometimes a 2-$(v,k,\lambda)$ design
is just called a $(v,k,\lambda)$-design. In the other notation the parameter $\lambda = 1$ can
be omitted so an $S_1(t,k,v)$-design is simply an $S(t,k,v)$-design. In that case we
also call it Steiner System (hence the “S”).


Corollary 6.4. If $\mathcal{B}$ is a $(v,k,\lambda)$-design (so $t = 2$) the results of the last
Theorem become


(i) $|\mathcal{B}| = \lambda \frac{v(v-1)}{k(k-1)} = \frac{v}{k}\, r$ where $r$ is:
(ii) $r = r_1 = \lambda \frac{v-1}{k-1}$.


Remark 6.5. For any $t$-$(v,k,\lambda)$-design all the numbers we derived above, such
as $|\mathcal{B}|$, $r_i$ ($1 \le i \le t$) are integers. In particular, if some choice of parameters
$t, k, v, \lambda$ does not yield integers, no design with those parameters exists.

For example, in any $(v, k = 3, \lambda = 1)$-design, we have $r_1 = \frac{v-1}{2}$ which is
integer only if $v$ is odd, and $|\mathcal{B}| = \frac{v(v-1)}{6}$ which is integer only if $v \in \{0,1,3,4\}$
(mod 6), so together we need $v \in \{1,3\}$ (mod 6). Note that for $v = 3$ we get
the trivial design and for $v = 7$ we get the Fano plane, so things seem to fit so
far.
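
The divisibility conditions are simple to evaluate with exact arithmetic. The
following sketch is our own illustration (the function names are hypothetical):
it computes the numbers $r_0 = |\mathcal{B}|, r_1, \ldots, r_t$ from Theorem 6.3 as
fractions and lists the values of $v$ that pass the test for $k = 3$, $\lambda = 1$.

```python
from fractions import Fraction
from math import comb


def replication_numbers(t, v, k, lam):
    """The numbers r_0 = |B|, r_1, ..., r_t of Theorem 6.3 as exact fractions."""
    return [Fraction(lam * comb(v - i, t - i), comb(k - i, t - i))
            for i in range(t + 1)]


def divisibility_ok(t, v, k, lam):
    """True if all numbers r_i are integers (a necessary condition only)."""
    return all(r.denominator == 1 for r in replication_numbers(t, v, k, lam))


if __name__ == "__main__":
    print(replication_numbers(2, 7, 3, 1))
    print([v for v in range(3, 40) if divisibility_ok(2, v, 3, 1)])
```

Passing this test does not show that a design exists; it only rules out
parameter sets that fail it.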

However, the necessary conditions for the existence of designs are not yet 
sufficient. In the following Theorem we derive another necessary condition, this 
time a lower bound on v. 


Theorem 6.6. In any non-trivial $t$-$(v, k, \lambda = 1)$-design we have
$v \ge (t+1)(k-t+1)$.


Proof. First observe that for $A, B \in \mathcal{B}$ with $A \ne B$ we have $|A \cap B| \le t - 1$.
Otherwise some set $T \subseteq A \cap B$ of size $t$ would be contained in both $A$ and $B$,
contradicting $\lambda = 1$.

Now choose some $B_0 \in \mathcal{B}$ and a set $S$ of size $t+1$ such that $|S \cap B_0| = t$. This
implies that $S$ is not contained in any $B \in \mathcal{B}$ (otherwise the $t$-set $S \cap B_0$ would be
contained in both $B_0$ and $B$). Still each of the $t$-sets $T \subseteq S$ must be contained
in some block $B_T$. Consider two such blocks $B_T$ and $B_{T'}$. Each contains $k$
elements, $t$ of which are in $S$ and $k - t$ are outside of $S$. The intersection
$B_T \cap B_{T'}$ certainly contains $T \cap T'$ which is already of size $t - 1$. This means
(by the observation in the beginning of the proof) that $B_T$ and $B_{T'}$ must not
have any further intersections outside of $S$.


Figure 43: Consider the case $t = 3$ and $k = 9$. After choosing some $B_0 \in \mathcal{B}$
we find a set $S = \{a,b,c,d\}$ of size $t + 1$ that intersects $B_0$ in exactly $t = 3$
points $a, b, c$. Now no block may contain $S$ (as argued above) but each set
$\{a,b,c\}, \{b,c,d\}, \{c,d,a\}, \{d,a,b\}$ must be contained in some block. These four
blocks must be disjoint outside of $S$ (they already intersect in $t - 1$ elements
within $S$). So we count $k - t = 6$ elements for each of them plus the elements
from $S$, so $v \ge 24 + 4 = 28$.


This means for each of the blocks $B_T$ (and there are $t + 1$ of them) that
there are $k - t$ elements outside of $S$ that are not contained in any other $B_{T'}$.
Together with the elements from $S$ this gives

\[ v \ge \underbrace{(t+1)}_{|S|} + (t+1) \underbrace{(k-t)}_{\text{for each } B_T} = (t+1)(k-t+1). \]

So we can conclude, for instance, that no design with parameters $\lambda = 1$,
$t = 10$, $v = 72$ and $k = 16$ exists (even though the divisibility conditions of
Theorem 6.3 are satisfied).

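
Both claims about the parameters $\lambda = 1$, $t = 10$, $v = 72$, $k = 16$ can be
confirmed with a few lines of exact arithmetic. This is a small check of our
own, using the formula for $r_i$ from Theorem 6.3 and the bound from Theorem 6.6.

```python
from fractions import Fraction
from math import comb

t, v, k, lam = 10, 72, 16, 1

# r_i = lam * C(v-i, t-i) / C(k-i, t-i) for i = 0, ..., t (Theorem 6.3)
r = [Fraction(lam * comb(v - i, t - i), comb(k - i, t - i)) for i in range(t + 1)]
divisible = all(x.denominator == 1 for x in r)

# Lower bound on v from Theorem 6.6
bound = (t + 1) * (k - t + 1)

if __name__ == "__main__":
    print(divisible, bound, v >= bound)
```

The divisibility test passes, but the bound evaluates to $11 \cdot 7 = 77 > 72$.
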
Remark. In a recent, so far unpublished paper¹⁰ Peter Keevash shows that for
any parameters $\lambda, t, k$ there exist $t$-$(v,k,\lambda)$ designs for all sufficiently large $v$
that satisfy the divisibility conditions of Theorem 6.3. The
proof is very sophisticated so we will not include it here.


In the next Theorem, we find a lower bound on the number of blocks of a
design.


Theorem 6.7 (Fisher’s inequality). Let $\mathcal{B}$ be a non-trivial $(v,k,\lambda)$-design (so
$k < v$). Then $|\mathcal{B}| \ge v$.


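Proof. Consider the incidence matrix $A$ with $v$ rows, $|\mathcal{B}|$ columns and values
$(a_{p,B})_{p \in V, B \in \mathcal{B}}$ defined as

\[ a_{p,B} = \begin{cases} 1, & p \in B, \\ 0, & \text{otherwise.} \end{cases} \]

If $a_p, a_q$ are rows in $A$, then $a_p \cdot a_q$ is the number of blocks containing both $p$
and $q$. This value is $\lambda$ for $p \ne q$ and $r = r_1$ for $p = q$. So we have

\[ A \cdot A^T = \begin{pmatrix}
   r & \lambda & \cdots & \lambda \\
   \lambda & r & \ddots & \vdots \\
   \vdots & \ddots & \ddots & \lambda \\
   \lambda & \cdots & \lambda & r
   \end{pmatrix}. \]

The determinant is $\det(A \cdot A^T) = rk(r - \lambda)^{v-1}$ (verify this!).

By Corollary 6.4(ii) we have $r = \lambda \frac{v-1}{k-1} > \lambda$. So the determinant is not zero
and thus the matrix has full rank, i.e. rank $v$. This means that the matrix
$A$ must also have rank at least $v$. In particular $A$ has at least $v$ columns and
therefore $|\mathcal{B}| \ge v$ as claimed.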
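
The determinant formula that is left as an exercise can be obtained from a short
eigenvalue computation. The following is one possible derivation, given as a
sketch; the symbols $I$ (identity matrix) and $J$ (all-ones matrix, both of size
$v \times v$) are our own notation and do not appear in the text.

```latex
A A^T = (r-\lambda)\, I + \lambda J .
% J has the eigenvalue v (eigenvector: the all-ones vector)
% and the eigenvalue 0 with multiplicity v-1, hence
\det(A A^T) = \bigl(r - \lambda + \lambda v\bigr)\,(r-\lambda)^{v-1} .
% By Corollary 6.4 we have \lambda (v-1) = r (k-1), so
r - \lambda + \lambda v = r + \lambda (v-1) = r + r(k-1) = rk ,
\qquad\text{and therefore}\qquad
\det(A A^T) = rk\,(r-\lambda)^{v-1} .
```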


6.2 Construction of Designs 


So far we talked about necessary conditions for the existence of designs without 
actually constructing any — which is our goal in the following. We have to start 
with a few definitions, though. 


Definition 6.8. 


• A $(v,k,\lambda)$-design with $|\mathcal{B}| = v$ is called symmetric. For those designs the
matrix $A$ we discussed above is square.


• A symmetric $(v,k,1)$-design is called projective plane. The Fano plane
from Figure 42 is such a projective plane with 7 points and 7 blocks
(“lines”).


• A $(v, k = 3, \lambda = 1)$-design or, equivalently, a Steiner System with
parameters $S(t = 2, k = 3, v)$ is called a Steiner Triple System.


¹⁰ http://arxiv.org/abs/1401.3665 submitted on 15 Jan 2014.


In the following we write $\mathbb{Z}_n$ for the cyclic group with $n$ elements, in other
words, the numbers modulo $n$.


Definition 6.9. Let $2 \le k < v$ and $\lambda \ge 1$. A $(v,k,\lambda)$-difference set is a set
$D = \{d_1, \ldots, d_k\} \subseteq \mathbb{Z}_v$ such that each element in $\mathbb{Z}_v \setminus \{0\}$ has multiplicity $\lambda$ in
the multiset $\{d_i - d_j \mid i,j \in [k], i \ne j\}$. In other words, each non-zero number
can be written as the difference of two numbers from $D$ in exactly $\lambda$ ways.


Example 6.10. The set $\{1,3,4,5,9\} \subseteq \mathbb{Z}_{11}$ is a $(11,5,2)$-difference set, since
each of the numbers can be written as a difference in two ways, as follows:


    Number   First Way   Second Way

       1       4 - 3        5 - 4
       2       3 - 1        5 - 3
       3       1 - 9        4 - 1
       4       5 - 1        9 - 5
       5       9 - 4        3 - 9
       6       9 - 3        4 - 9
       7       1 - 5        5 - 9
       8       1 - 4        9 - 1
       9       1 - 3        3 - 5
      10       3 - 4        4 - 5

Theorem 6.11. If $D = \{d_1, \ldots, d_k\}$ is a $(v,k,\lambda)$-difference set then

\[ \mathcal{B} := \{D, 1 + D, 2 + D, \ldots, v - 1 + D\} \]

is a symmetric $(v,k,\lambda)$-design. Here, by $a + D$ we mean the set $\{a + d \mid d \in D\}$.


Proof. Note that there are $k \cdot (k-1)$ ways to form terms $i - j$ with $i \ne j$ and
$i, j \in D$. On the other hand, each non-zero number can be written in exactly
$\lambda$ ways like this, so $\lambda(v-1) = k \cdot (k-1)$ which implies $\lambda = \frac{k(k-1)}{v-1} < k$.

From this we conclude that the blocks we proposed above are all distinct:
Assume $a + D = b + D$ for $a \ne b$. Then we have a permutation $\pi \in S_k$ such
that $a + d_i = b + d_{\pi(i)}$ for all $i \in [k]$. Then $a - b = d_{\pi(i)} - d_i$ for each $i \in [k]$ so
we have $k > \lambda$ ways to write $a - b$ as a difference, contradicting the property of
the difference set.

By construction we have $v$ points in total and blocks of size $k$. The only thing
left to check is that each pair of two distinct elements $x, y \in \mathbb{Z}_v$ is contained in
exactly $\lambda$ of the blocks. Let $d = x - y$ be the difference of those two elements.
Then there are exactly $\lambda$ ways to pick two elements $x_i, y_i \in D$ with difference $d$
and for each such pair there is a unique shift $a$ such that $x = x_i + a$, $y = y_i + a$.
This implies that $x, y \in a + D$ for exactly $\lambda$ choices of $a$.


Example. If $v$ is a prime power with $v \equiv 3 \pmod 4$ then

\[ \{a^2 \mid a \in \mathbb{Z}_v,\ a \ne 0\} \]

is a $(v,k,\lambda)$-difference set with $k = \frac{v-1}{2}$ and $\lambda = \frac{v-3}{4}$. We do not prove this.
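
Difference sets and the designs they generate are easy to experiment with. The
sketch below is our own illustration with hypothetical function names. It
checks the difference-set property, builds the blocks $a + D$, and tests the
squares construction for a few primes $v \equiv 3 \pmod 4$ (we restrict to prime
$v$ here, where $\mathbb{Z}_v$ is a field; this restriction is our assumption).

```python
from collections import Counter
from itertools import combinations


def is_difference_set(D, v, lam):
    """True if every nonzero element of Z_v is a difference of two
    distinct elements of D in exactly lam ways."""
    diffs = Counter((a - b) % v for a in D for b in D if a != b)
    return all(diffs[d] == lam for d in range(1, v))


def develop(D, v):
    """The blocks a + D for a = 0, ..., v-1."""
    return [frozenset((a + d) % v for d in D) for a in range(v)]


def nonzero_squares(p):
    """The set {a^2 | a in Z_p, a != 0} for a prime p."""
    return {a * a % p for a in range(1, p)}


if __name__ == "__main__":
    D = {1, 3, 4, 5, 9}
    print(is_difference_set(D, 11, 2), nonzero_squares(11) == D)
    for block in develop(D, 11):
        print(sorted(block))
```

Note that $\{1,3,4,5,9\}$ from Example 6.10 is exactly the set of nonzero squares
modulo 11.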


6.3 Projective Planes


We should mention that projective planes actually come from geometry. There,
a projective plane of order $q$ is defined as a set of $q^2 + q + 1$ points with a family
of subsets of points (lines) such that (i) each line is of size $q + 1$ and (ii) each
pair of points is contained in a unique line.

In our setting, projective planes of order $q$ are symmetric $(v, k, \lambda = 1)$-designs
where $k = q + 1$. In that case we can indeed conclude

\[ v = |\mathcal{B}| \overset{\text{Cor. 6.4}}{=} \frac{v}{k}\, r \;\Rightarrow\; r = k = q + 1 \]

so each point is contained in $q + 1$ lines and

\[ r \overset{\text{Cor. 6.4}}{=} \frac{v-1}{k-1} \;\Rightarrow\; v = q(q+1) + 1 = q^2 + q + 1, \]

so the number of points matches as well. Before we continue, note the following
property:


Claim. Any two lines (blocks) of a projective plane intersect in a unique point. 


Proof of Claim. Consider two distinct lines $L_1, L_2$ and some $x \in L_1 \setminus L_2$. Then
consider the family of lines $(L_y)_{y \in L_2}$, where $L_y$ is the unique line containing $x$
and $y$. These lines are all distinct (otherwise we would find some $y_1, y_2$ contained
in two lines: $L_2$ and $L_{y_1}$), so we found $q + 1$ lines that all contain $x$. This means
we must have counted $L_1$ in the process, otherwise there would be $q + 2$ lines
containing $x$ (contradicting $r = q + 1$). We conclude $L_1 \cap L_2 \ne \emptyset$. Clearly
$|L_1 \cap L_2| \le 1$, otherwise we would have a pair of points contained in $L_1$ and $L_2$
(contradicting $\lambda = 1$).


Construction of Projective Planes 


In the following construction of a projective plane of order $q$ we use $\mathbb{F}_q$ to denote
a field of size $q$ where $q$ is a prime power.

The points of the plane are equivalence classes of the set $X = \{(x_0, x_1, x_2) \mid
x_i \in \mathbb{F}_q, (x_0, x_1, x_2) \ne 0\}$ where two vectors are equivalent if they are scalar
multiples of one another. Formally, a point $[x_0, x_1, x_2]$ is the set:

\[ [x_0, x_1, x_2] = \{c \cdot (x_0, x_1, x_2) \mid c \in \mathbb{F}_q \setminus \{0\}\}. \]

Note the same point can be described in different ways, for instance $[x_0, x_1, x_2] =
[2x_0, 2x_1, 2x_2]$ (if the characteristic of $\mathbb{F}_q$ is not 2). Since $|X| = q^3 - 1$ and $q - 1$
elements represent the same point (are in the same class), we have
$\frac{q^3 - 1}{q - 1} = q^2 + q + 1$ points in total, as desired.


For $(a_0, a_1, a_2) \in \mathbb{F}_q^3 \setminus \{0\}$ we define the line $L([a_0, a_1, a_2])$ as

\[ L([a_0, a_1, a_2]) = \{[x_0, x_1, x_2] \mid a_0 x_0 + a_1 x_1 + a_2 x_2 = 0\}. \]

Note that lines are well defined, that is the definition respects the equivalence
classes (either all elements representing a point satisfy the equation of the line
or none of them).

The number of solutions to $a_0 x_0 + a_1 x_1 + a_2 x_2 = 0$ is $q^2$ since, without loss of
generality, $a_2 \ne 0$ and thus $x_0$ and $x_1$ can be chosen arbitrarily and $x_2$ is then
uniquely determined. Disregarding the solution $x_0 = x_1 = x_2 = 0$ (which does
not represent any point) we have $q^2 - 1$ solutions in $X$ that are part of lines,
meaning $\frac{q^2 - 1}{q - 1} = q + 1$ points are part of the line, as desired.

Now consider two points $[x_0, x_1, x_2] \ne [y_0, y_1, y_2]$. We show that they are
contained in a unique line. Indeed, if $L([a_0, a_1, a_2])$ contains both, then we have
$a_0 x_0 + a_1 x_1 + a_2 x_2 = 0$, $a_0 y_0 + a_1 y_1 + a_2 y_2 = 0$. This is a homogeneous system
with two equations and three variables $a_0, a_1, a_2$. The two equations are
linearly independent because the two points are distinct, so there exists a solution
$(a_0, a_1, a_2) \ne 0$ and all solutions are of the form $c \cdot (a_0, a_1, a_2)$. All those solution
triples define the same line, so $L([a_0, a_1, a_2])$ is uniquely determined.

This concludes the construction (and the verification thereof). 
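
For a prime $q$ the field $\mathbb{F}_q$ is just the integers modulo $q$, and the
construction can be carried out directly. The following sketch is our own
illustration and is restricted to prime $q$ (proper prime powers would need a
different field implementation). The function names are hypothetical; each
point is represented by the unique member of its class whose first nonzero
coordinate is 1.

```python
from itertools import combinations, product


def normalize(vec, q):
    """Scale a nonzero vector over Z_q (q prime) so that its first
    nonzero coordinate becomes 1."""
    lead = next(x for x in vec if x % q)
    inv = pow(lead, -1, q)  # modular inverse, needs Python 3.8 or newer
    return tuple(x * inv % q for x in vec)


def projective_plane(q):
    """Points and lines of the projective plane of prime order q."""
    points = sorted({normalize(v, q)
                     for v in product(range(q), repeat=3) if any(v)})
    # Lines are indexed by the same classes [a_0, a_1, a_2].
    lines = [frozenset(p for p in points
                       if sum(a * x for a, x in zip(coef, p)) % q == 0)
             for coef in points]
    return points, lines


if __name__ == "__main__":
    points, lines = projective_plane(2)
    print(len(points), len(lines))
    for line in lines:
        print(sorted(line))
```

For $q = 2$ this produces a plane with 7 points and 7 lines of size 3.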


Remark. It is conjectured that the order of every projective plane is a prime 
power, but no proof is known. 


Theorem 6.12 (Bruck-Ryser-Chowla-Theorem). If $v, k, \lambda \in \mathbb{N}$ with $\lambda(v-1) =
k(k-1)$ and a symmetric $(v,k,\lambda)$-design exists then


• if $v$ is even, then $k - \lambda$ is a square,


• if $v$ is odd, then $z^2 = (k - \lambda)x^2 + (-1)^{(v-1)/2} \lambda y^2$ has an integer solution
$(x,y,z) \ne 0$.


6.4 Steiner Triple Systems 


We leave projective planes for now and come back to Steiner Triple Systems, i.e.
$(v, k = 3, \lambda = 1)$-designs. From Remark 6.5 we know that $v \in \{1,3\}$ (mod 6).
The following Theorem shows that this necessary condition on the existence is
also sufficient.


Theorem 6.13. For each v € N with v € {1,3} (mod 6) a Steiner Triple 
System with v points exists. 


Proof. The construction is different for $v \equiv 1$ and $v \equiv 3 \pmod 6$. We start
with the latter case.

The points are $V = \mathbb{Z}_{2n+1} \times \mathbb{Z}_3$, so $v = (2n+1) \cdot 3 = 6n + 3 \equiv 3 \pmod 6$.
There are two types of blocks:


type 1: $\{(x, 0), (x, 1), (x, 2)\}$ for each $x \in \mathbb{Z}_{2n+1}$,
type 2: $\{(x, i), (y, i), (\frac{x+y}{2}, i + 1)\}$ for all $x \neq y$ and each $i \in \mathbb{Z}_3$.

Here $\frac{x+y}{2}$ is well defined because 2 is invertible modulo the odd number $2n + 1$.
We need to show that each pair of elements is contained in a unique block. So
consider a pair $(x, i) \neq (y, j)$.

Case 1: $x = y$: Then the pair is contained in the block $\{(x, 0), (x, 1), (x, 2)\}$
(of type 1) and in no other block.

Case 2: $x \neq y$, $i = j$: The pair is contained in the block $\{(x, i), (y, i), (\frac{x+y}{2}, i + 1)\}$
(of type 2) and in no other block.

Case 3: $x \neq y$, $i \neq j$: Assume without loss of generality that $j = i + 1$ (since
$i, j \in \mathbb{Z}_3$ we always have this or $j = i - 1$). Then the pair is contained
in the block $\{(x, i), (y', i), (\frac{x+y'}{2}, i + 1)\}$ where $y' = 2y - x$, so that
$\frac{x+y'}{2} = y$. The pair is contained in no other block.
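The construction for $v \equiv 3 \pmod 6$ can be sketched in a few lines of Python and checked by brute force for small $n$. The function names are assumptions for this illustration; division by 2 is implemented as multiplication by $n + 1$, which is the inverse of 2 modulo $2n + 1$.

```python
from itertools import combinations

def bose_blocks(n):
    """Blocks of type 1 and type 2 on the point set Z_{2n+1} x Z_3."""
    m = 2 * n + 1
    half = n + 1  # inverse of 2 modulo m, since 2 * (n + 1) = m + 1
    blocks = [frozenset((x, i) for i in range(3)) for x in range(m)]  # type 1
    for i in range(3):
        for x, y in combinations(range(m), 2):
            mid = (x + y) * half % m
            blocks.append(frozenset({(x, i), (y, i), (mid, (i + 1) % 3)}))  # type 2
    return blocks

def is_steiner_triple_system(points, blocks):
    """True if all blocks have size 3 and every pair lies in exactly one block."""
    count = {pair: 0 for pair in combinations(sorted(points), 2)}
    for block in blocks:
        if len(block) != 3:
            return False
        for pair in combinations(sorted(block), 2):
            count[pair] += 1
    return all(c == 1 for c in count.values())
```

For instance, `bose_blocks(1)` yields a system on $v = 9$ points, which `is_steiner_triple_system` accepts when given the nine points of $\mathbb{Z}_3 \times \mathbb{Z}_3$.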


This concludes the case of $v \equiv 3 \pmod 6$; we proceed with $v \equiv 1 \pmod 6$.
The construction and verification are more complicated.

As point set, choose $V = (\mathbb{Z}_{2n} \times \mathbb{Z}_3) \cup \{\infty\}$. To simplify notation we write $x_i$
to denote the pair $(x, i) \in \mathbb{Z}_{2n} \times \mathbb{Z}_3$. The element $\infty$ is special and we assume
$x + \infty = \infty$ for any $x \in V$. Before we define the blocks of the Triple System,
we define four types of base blocks first:


• $\{0_0, 0_1, 0_2\}$,
• $\{\infty, 0_0, n_1\}$, $\{\infty, 0_1, n_2\}$, $\{\infty, 0_2, n_0\}$,
• $\{0_0, x_1, (-x)_1\}$, $\{0_1, x_2, (-x)_2\}$, $\{0_2, x_0, (-x)_0\}$, for each $x \in [n - 1]$,
• $\{n_0, x_1, (1 - x)_1\}$, $\{n_1, x_2, (1 - x)_2\}$, $\{n_2, x_0, (1 - x)_0\}$, for each $x \in [n]$.

The blocks of the Steiner Triple System are

$\mathcal{B} = \{a_0 + B \mid a \in \{0, 1, \ldots, n - 1\},\ B \text{ is a base block}\}$,

where the addition of elements is defined coordinate-wise, in particular $a_0 + x_i = (a + x)_i$.
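We need to check that there is a unique block containing $u, v$ with $u \neq v$.
We do not consider all cases here, just two important ones.


Case 1: $u = \infty$, $v = x_i$: If $x \leq n - 1$, then the block is $x_0 + \{\infty, 0_i, n_{i+1}\}$.
If $x \geq n$ then we find $(x - n)_0 + \{\infty, 0_{i-1}, n_i\}$.


Case 2: $u = x_i$, $v = y_i$: Without loss of generality $x < y$. Either $y - x = 2s$
(the difference is even), then $x_i, y_i$ is contained in the block $(y - s)_0 +
\{0_{i-1}, s_i, (-s)_i\}$. Or $y - x = 2s - 1$, then $x_i, y_i$ is contained in the block
$(y - s)_0 + \{n_{i-1}, s_i, (1 - s)_i\}$. If $y - s \geq n$, this shift is not among the
allowed ones; in that case use the shift $(y - s - n)_0$ and replace $s$ by $n - s$
in the even case and by $n + 1 - s$ in the odd case.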
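Since not all cases are treated above, a brute-force check for small $n$ may be reassuring. The following sketch builds the base blocks and their shifts exactly as listed; the names `INF`, `blocks_1_mod_6` and `covers_each_pair_once` are assumptions made for this illustration.

```python
from itertools import combinations

INF = "inf"  # the special point; shifting leaves it unchanged

def blocks_1_mod_6(n):
    """Blocks on (Z_{2n} x Z_3) plus INF, for n >= 1; (x, i) stands for x_i."""
    m = 2 * n
    base = [[(0, 0), (0, 1), (0, 2)]]
    for i in range(3):
        j = (i + 1) % 3
        base.append([INF, (0, i), (n, j)])
        for x in range(1, n):        # x in [n-1]
            base.append([(0, i), (x, j), (-x % m, j)])
        for x in range(1, n + 1):    # x in [n]
            base.append([(n, i), (x, j), ((1 - x) % m, j)])

    def shift(point, a):
        return point if point == INF else ((point[0] + a) % m, point[1])

    return [frozenset(shift(pt, a) for pt in block)
            for a in range(n) for block in base]

def covers_each_pair_once(num_points, blocks):
    """True if all blocks are triples and each pair of points occurs exactly once."""
    seen = {}
    for block in blocks:
        if len(block) != 3:
            return False
        for pair in combinations(block, 2):
            key = frozenset(pair)
            seen[key] = seen.get(key, 0) + 1
    total_pairs = num_points * (num_points - 1) // 2
    return len(seen) == total_pairs and all(c == 1 for c in seen.values())
```

For $n = 1$ this produces seven blocks on seven points, i.e. a Steiner Triple System with $v = 7$.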


6.5 Resolvable Designs 


Consider a game played by three players (like skat). Assume there is a tournament
for this game, with $n$ players. Each pair of players should meet in exactly
$\lambda$ games. There are several time slots during which the games
are scheduled. Of course every player can only play one game at a time. In an
optimal schedule all players have a game in each time slot. In the sense of the
following definition, such a schedule corresponds to a resolvable $(n, 3, \lambda)$-design,
where parallel classes correspond to the time slots.


Definition 6.14. A parallel class of a $(v, k, \lambda)$-design is a set of disjoint blocks
forming a partition of the point set $V$.

A partition of $\mathcal{B}$ into parallel classes is called a resolution. A design is resolvable
if it has a resolution.

Note that projective planes are examples of non-resolvable designs since no
disjoint blocks exist. An example of a resolvable $(v = 4, k = 2, \lambda = 1)$-design
with the corresponding parallel classes is shown in the next picture, where the
points are $A, B, C, D$ and the blocks are depicted by edges.


[Picture: the six edges on the points A, B, C, D, split into the three parallel classes {AB, CD}, {AD, BC} and {AC, BD}.]


This is a special case of a $(v = q^2, k = q, \lambda = 1)$-design. Such designs are called
affine planes and are, as we show now, always resolvable.


Theorem 6.15. Any $(v = q^2, k = q, \lambda = 1)$-design is resolvable.

Proof. The main observation is the following:


Claim. For each block $B$ and $x \notin B$ there is a unique block $B'$ with $x \in B'$
and $B \cap B' = \emptyset$.


Proof of Claim. For each $y \in B$ there is a unique block $B_y$ such that $\{x, y\} \subseteq B_y$.
These blocks are distinct (since $|B_y \cap B| \leq 1$) and all contain $x$. Because
of $r = \frac{\lambda(v-1)}{k-1} = \frac{q^2-1}{q-1} = q + 1$ there is exactly one block left that contains $x$
and is different from each $B_y$. It is disjoint from $B$.


With the claim proved, everything falls into place. Start with a maximal set 
of pairwise disjoint blocks. This set forms a parallel class (this is not obvious, but 
easy to prove). Then, in the next phase, take some other set of disjoint blocks 
not yet considered. This forms a parallel class in the same way. Continue like 
this until all blocks are handled (it is easy to verify that this works). 
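The grouping into parallel classes can be tried out on a concrete affine plane. The sketch below assumes the standard coordinate plane over $\mathbb{Z}_p$ for a prime $p$ (lines $x = c$ and $y = mx + c$), which is not constructed in the text, and uses a greedy variant of the procedure: each block joins the first class whose blocks it is disjoint from.

```python
def affine_plane(p):
    """Lines of the affine plane over Z_p, assuming p is prime."""
    lines = [frozenset((c, y) for y in range(p)) for c in range(p)]  # x = c
    for m in range(p):
        for c in range(p):
            lines.append(frozenset((x, (m * x + c) % p) for x in range(p)))  # y = m*x + c
    return lines

def parallel_classes(blocks):
    """Greedily group blocks into classes of pairwise disjoint blocks."""
    classes = []
    for block in blocks:
        for cls in classes:
            if all(block.isdisjoint(other) for other in cls):
                cls.append(block)
                break
        else:
            classes.append([block])
    return classes
```

By the claim, being equal or disjoint is an equivalence relation on the blocks, which is why the greedy grouping cannot put a block into a wrong class. For $p = 3$ the result consists of four classes of three lines each.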


6.6 Latin Squares 


In the following we consider a structure that seems unrelated to designs at first, 
but the final Theorem will establish an interesting connection to affine planes. 

A Latin square of order $n \geq 1$ is an $n \times n$ array filled with numbers from the
cyclic group $\mathbb{Z}_n = \{0, \ldots, n - 1\}$ such that each row and each column contains
each number from $\mathbb{Z}_n$ exactly once. Consider for instance the following two
Latin squares

$$A = \begin{pmatrix} 1 & 0 & 2 & 3 \\ 0 & 1 & 3 & 2 \\ 2 & 3 & 1 & 0 \\ 3 & 2 & 0 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & 0 & 3 & 2 \\ 3 & 2 & 1 & 0 \\ 0 & 1 & 2 & 3 \\ 2 & 3 & 0 & 1 \end{pmatrix}.$$

We denote the set of positions containing the number $i$ in a Latin square $L$ by
$L(i)$; in the example above we have for instance $B(3) = \{(1, 3), (2, 1), (3, 4), (4, 2)\}$.
Another way to think about Latin squares is to say that they are a partition
$L(0) \cup \ldots \cup L(n - 1)$ of $[n] \times [n]$ such that each set $L(i)$ is a set of positions
on which rooks (from the game of chess) cannot attack
each other. Every solution to a Sudoku puzzle is a Latin square with $n = 9$,
although the reverse is not true: Latin squares do not respect the constraint for
the $3 \times 3$ boxes.
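The definition translates directly into a small check. The following sketch (function names are assumptions for this illustration) stores a square as a list of rows and reports positions 1-based, as in the text; `A` and `B` are the two example squares.

```python
def is_latin_square(square):
    """True if every row and every column contains 0, ..., n-1 exactly once."""
    n = len(square)
    symbols = set(range(n))
    if any(len(row) != n or set(row) != symbols for row in square):
        return False
    return all({square[i][j] for i in range(n)} == symbols for j in range(n))

def positions(square, number):
    """The set L(number) of 1-based positions (row, column) holding the number."""
    return {
        (i + 1, j + 1)
        for i, row in enumerate(square)
        for j, entry in enumerate(row)
        if entry == number
    }

A = [[1, 0, 2, 3], [0, 1, 3, 2], [2, 3, 1, 0], [3, 2, 0, 1]]
B = [[1, 0, 3, 2], [3, 2, 1, 0], [0, 1, 2, 3], [2, 3, 0, 1]]
```

Here `positions(B, 3)` reproduces the set $B(3)$ given above, and the sets `positions(B, 0)`, ..., `positions(B, 3)` together partition the 16 positions.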

There are many ways to transform a Latin square to similar Latin squares:
You could permute the rows or columns or “rename” the numbers (swap $A(1)$
and $A(0)$ for instance). At least the latter operation seems uninteresting as
renaming numbers yields the same partition, just with different labels. To
capture the idea of “entirely different” Latin squares, we propose the notion
of orthogonality defined in the following.

Let $A, B$ be Latin squares of order $n$ with entries $A = (a_{ij})_{i,j}$, $B = (b_{ij})_{i,j}$.
The juxtaposition of $A$ and $B$ is the $n \times n$ array where each position simply
contains the corresponding numbers of $A$ and $B$, i.e.

$A \otimes B = ((a_{ij}, b_{ij}))_{i,j}$ with entries in $\mathbb{Z}_n \times \mathbb{Z}_n$.

For instance, with $A$ and $B$ from the example above we get

$$A \otimes B = \begin{pmatrix} 1,1 & 0,0 & 2,3 & 3,2 \\ 0,3 & 1,2 & 3,1 & 2,0 \\ 2,0 & 3,1 & 1,2 & 0,3 \\ 3,2 & 2,3 & 0,0 & 1,1 \end{pmatrix}.$$

We say $A, B$ are orthogonal if $A \otimes B$ contains each pair from $\mathbb{Z}_n \times \mathbb{Z}_n$ exactly
once. The two Latin squares above are not orthogonal because some pairs occur
twice (e.g. $(2, 0)$) and, necessarily, some pairs therefore do not occur at all (e.g.
$(0, 1)$).


An example for two Latin squares $A$ and $C$ with $A$ orthogonal to $C$ is


[Display of the squares $A$ and $C$ and of their juxtaposition $A \otimes C$.]


Note that this notion of orthogonality has little to do with geometric notions
of orthogonality you may be familiar with. Observe the following:


• If $A$ is orthogonal to $B$, then $B$ is orthogonal to $A$.


• If $A$ is orthogonal to $B$, then “renaming” numbers in $A$ or $B$ (for instance
replacing every 0 with a 1 and vice versa) preserves orthogonality.
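Juxtaposition and the orthogonality test are short enough to write down as code. This is a minimal sketch with hypothetical function names; squares are lists of rows with entries from $\{0, \ldots, n-1\}$.

```python
from collections import Counter

def juxtaposition(a, b):
    """The array of pairs (a_ij, b_ij)."""
    n = len(a)
    return [[(a[i][j], b[i][j]) for j in range(n)] for i in range(n)]

def pair_counts(a, b):
    """How often each pair occurs in the juxtaposition of a and b."""
    return Counter(pair for row in juxtaposition(a, b) for pair in row)

def are_orthogonal(a, b):
    # There are n^2 cells, so n^2 distinct pairs means each pair occurs exactly once.
    return len(pair_counts(a, b)) == len(a) ** 2

A = [[1, 0, 2, 3], [0, 1, 3, 2], [2, 3, 1, 0], [3, 2, 0, 1]]
B = [[1, 0, 3, 2], [3, 2, 1, 0], [0, 1, 2, 3], [2, 3, 0, 1]]
```

With the squares $A$ and $B$ from the running example, `pair_counts(A, B)` shows the pair $(2, 0)$ twice, so `are_orthogonal(A, B)` is false. Because the test only looks at the set of pairs, it is symmetric in its arguments, matching the first observation above.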


Remark. For $n \in \{2, 6\}$ there are no two orthogonal Latin squares of order $n$.
For $n = 2$ this is easy to see since the only Latin squares with $n = 2$ are

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

They are not orthogonal to each other nor to themselves (only for $n = 1$ can a
Latin square be orthogonal to itself). For $n = 6$ the argument is non-trivial.

For $n \in \mathbb{N} \setminus \{2, 6\}$, a pair of orthogonal Latin squares of order $n$ exists. The
proof of this is highly non-trivial; in fact it took over a hundred years to show
that there is a pair of orthogonal Latin squares of order 10.


Our goal in the following is to construct a large set $A_1, \ldots, A_k$ of Latin
squares of order $n$ that are MOLS (mutually orthogonal Latin squares), meaning
$A_i$ is orthogonal to $A_j$ for $i \neq j$.


Theorem 6.16. Let $n$ be a positive integer and $r \in [n - 1]$ co-prime
to $n$, i.e. $\gcd(n, r) = 1$. Then $L_n^r = (r \cdot i + j \pmod n)_{i,j}$ is a Latin square.

To clarify how $L_n^r$ looks, consider the example $n = 5$ and $r = 2$. “Going
right” corresponds to $+1$ and going down corresponds to $+2$ which gives

$$L_5^2 = \begin{pmatrix} 3 & 4 & 0 & 1 & 2 \\ 0 & 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 0 & 1 \\ 4 & 0 & 1 & 2 & 3 \\ 1 & 2 & 3 & 4 & 0 \end{pmatrix}.$$

Proof. We have to show that each row and column contains each number exactly
once. For rows this is clear, since the $i$-th row contains $r \cdot i + 1, r \cdot i + 2, \ldots, r \cdot i + n$,
which traverses all numbers modulo $n$. The column $j$ contains $r + j, 2r + j, \ldots, n \cdot r + j$,
all modulo $n$. Assume two of those numbers are identical, say
$i_1 \cdot r + j \equiv i_2 \cdot r + j$; we conclude $(i_1 - i_2) \cdot r \equiv 0 \pmod n$. Since $\gcd(n, r) = 1$
(meaning $r$ has an inverse modulo $n$) we actually had $i_1 = i_2$. So no column
contains a number twice and $L_n^r$ really is a Latin square of order $n$.
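The squares $L_n^r$ are easy to generate. The sketch below (function names are assumptions for this illustration) uses the 1-based indices $i, j \in [n]$ of the theorem and also shows what goes wrong when $r$ is not co-prime to $n$.

```python
from math import gcd

def cyclic_square(n, r):
    """L_n^r: the entry in row i, column j (both 1-based) is r*i + j mod n."""
    return [[(r * i + j) % n for j in range(1, n + 1)] for i in range(1, n + 1)]

def is_latin_square(square):
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok

def latin_exactly_when_coprime(max_n):
    """Check for all 2 <= n <= max_n and r in [n-1] that L_n^r is Latin iff gcd(n, r) = 1."""
    return all(
        is_latin_square(cyclic_square(n, r)) == (gcd(n, r) == 1)
        for n in range(2, max_n + 1)
        for r in range(1, n)
    )
```

For instance, `cyclic_square(5, 2)` reproduces the square $L_5^2$ displayed above, while `cyclic_square(6, 2)` repeats entries within each column because $\gcd(6, 2) = 2$.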


Theorem 6.17. If $n$ is prime, then $L_n^1, \ldots, L_n^{n-1}$ are $n - 1$ MOLS of order $n$.


Proof. Since $n$ is prime, $\gcd(n, i) = 1$ for all $i \in [n - 1]$ so by Theorem 6.16
$L_n^1, \ldots, L_n^{n-1}$ are Latin squares of order $n$.

Now consider two of those squares $L_n^r$ and $L_n^s$ with $r \neq s$. We need to show
that they are orthogonal, so suppose some pair of numbers from $\mathbb{Z}_n \times \mathbb{Z}_n$ appears
in $L_n^r \otimes L_n^s$ in positions $(i, j)$ and $(k, l)$. By definition of $L_n^r$ and $L_n^s$ this gives
the identity:

$(r \cdot i + j, s \cdot i + j) = (r \cdot k + l, s \cdot k + l)$.


So $r \cdot (i - k) = l - j = s \cdot (i - k)$ which implies $(r - s)(i - k) = 0$. The numbers
modulo $n$ form a field and a product can only be zero if one of the factors is
zero. Since $r \neq s$ we obtain $i = k$ and therefore also $l = j$. In particular we
showed that the same pair cannot appear in distinct positions of $L_n^r \otimes L_n^s$,
so $L_n^r$ is orthogonal to $L_n^s$ as desired.
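The statement can be confirmed by brute force for small primes, and the same code shows why primality matters. This is a minimal sketch with hypothetical function names.

```python
from itertools import combinations

def cyclic_square(n, r):
    """L_n^r with 1-based indices: entry (i, j) is r*i + j mod n."""
    return [[(r * i + j) % n for j in range(1, n + 1)] for i in range(1, n + 1)]

def are_orthogonal(a, b):
    """True if the juxtaposition of a and b contains n^2 distinct pairs."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

def all_mutually_orthogonal(squares):
    return all(are_orthogonal(a, b) for a, b in combinations(squares, 2))

def cyclic_mols(n):
    """The candidate family L_n^1, ..., L_n^{n-1}."""
    return [cyclic_square(n, r) for r in range(1, n)]
```

For $n = 4$ the squares $L_4^1$ and $L_4^3$ are both Latin, but $(r - s)(i - k) = 0$ has the non-trivial solution $i - k = 2$ modulo 4, and indeed `are_orthogonal(cyclic_square(4, 1), cyclic_square(4, 3))` is false.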


Remark. For a prime power $n = p^k$ there is a field $\mathbb{F}_n = \{\alpha_0 = 0, \alpha_1, \ldots, \alpha_{n-1}\}$
of order $n$. For it we can define $L^{\alpha_r} = (\alpha_r \cdot \alpha_i + \alpha_j)_{i,j}$ and generalize Theorem
6.16 and Theorem 6.17 accordingly.


Lemma 6.18. For any $n \geq 2$ there are at most $n - 1$ MOLS of order $n$.


Proof. Assume we have a set $A_1, \ldots, A_k$ of MOLS. As orthogonality is not
affected by “renaming” the numbers in a Latin square, we may assume without
loss of generality that each $A_i$ has $0 \mid 1 \mid 2 \mid \cdots \mid n - 1$ as its first row. Now
consider the entry at position $(2, 1)$. If $A$ is orthogonal to $B$ then $a_{21} \neq b_{21}$
because all pairs $(0, 0), \ldots, (n - 1, n - 1)$ already appear in row 1 of $A \otimes B$. So
the entries in $(2, 1)$ must be mutually distinct. They must also be non-zero since
the first column of each $A_i$ already contains zero in position $(1, 1)$. Therefore,
we started with at most $k \leq n - 1$ Latin squares.
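The normalization step of the proof can be made concrete. The sketch below (hypothetical function names) renames the numbers of each square so that its first row reads $0, 1, \ldots, n - 1$ and then collects the entries at position $(2, 1)$, here for the squares $L_n^r$ with entries $r \cdot i + j \bmod n$.

```python
def cyclic_square(n, r):
    """L_n^r with 1-based indices: entry (i, j) is r*i + j mod n."""
    return [[(r * i + j) % n for j in range(1, n + 1)] for i in range(1, n + 1)]

def normalize_first_row(square):
    """Rename the numbers so that the first row becomes 0, 1, ..., n-1."""
    rename = {old: new for new, old in enumerate(square[0])}
    return [[rename[entry] for entry in row] for row in square]

def entries_at_2_1(squares):
    """Entries at position (2, 1) (1-based) after normalizing every square."""
    return [normalize_first_row(sq)[1][0] for sq in squares]
```

For the four squares $L_5^1, \ldots, L_5^4$ the collected entries are four distinct non-zero numbers, so no fifth square could be added to the family.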


And now for the theorem that establishes the connection to designs. We will
only prove the equivalence of (i) and (iv), but we will do so constructively.


Theorem 6.19. For $n \geq 2$ the following are equivalent:


(i) There exist $n - 1$ mutually orthogonal Latin squares of order $n$.

(ii) There exists a finite field of order $n$.


(iii) $n$ is a prime power, i.e. $n = p^k$.


(iv) There exists a $(v = n^2, k = n, \lambda = 1)$-design (an affine plane).


Proof. (i) $\Rightarrow$ (iv). Let $A_1, \ldots, A_{n-1}$ be $n - 1$ MOLS of order $n$. We need to
construct a $(v = n^2, k = n, \lambda = 1)$-design. Recall (from Corollary 6.4)
that the number of blocks in such a design is necessarily $|\mathcal{B}| = n(n + 1)$
and from Theorem 6.15 that the design is necessarily resolvable, so it splits
into $n + 1$ parallel classes of $n$ disjoint blocks each (this resolution will be
apparent).


The points of the design are $[n] \times [n]$, i.e. the set of positions in Latin
squares of order $n$.


Recall how each Latin square $A$ is a partition of the positions $[n] \times [n] =
A(0) \cup \ldots \cup A(n - 1)$ where $A(i)$ are the positions containing the number $i \in \mathbb{Z}_n$.
These will be blocks of the design. In addition, each row $R(i) = \{(i, j) \mid
j \in [n]\}$ and each column $C(j) = \{(i, j) \mid i \in [n]\}$ is a block. So in total
the blocks are:


$\mathcal{B} = \{A_r(s) \mid r \in [n - 1], s \in \mathbb{Z}_n\} \cup \{R(i) \mid i \in [n]\} \cup \{C(j) \mid j \in [n]\}$.


We need to show that each pair of distinct points $(i, j)$ and $(k, l)$ appears
in exactly one block. We first show that they are contained in at most one
block.


Case 1: $i = k$. Both points are contained in the $i$-th row so both are
in $R(i)$. They cannot be contained in the same column (otherwise
they would not be distinct) and they are not both contained in any
$A_r(s)$ since that would mean that the Latin square $A_r$ contains the
number $s$ twice in the $i$-th row. In particular, no block other than
$R(i)$ contains both points.


Case 2: $j = l$. Similar to Case 1, the points are contained in $C(j)$ and
in no other block.


Case 3: $i \neq k$, $j \neq l$. The points are not in the same row or column so
in no $R(i)$ or $C(j)$. But assume (for contradiction) that we have
$(i, j), (k, l) \in A_r(s) \cap A_t(u)$ for $(r, s) \neq (t, u)$. Since the blocks
originating from the same Latin square are disjoint (since those blocks
form a partition), we have $r \neq t$. So in $A_r$ there is the number $s$
in both positions (i.e. at $(i, j)$ and $(k, l)$) and in $A_t$ there is $u$ in
both positions. This means $A_r \otimes A_t$ has $(s, u)$ in both positions,
contradicting the fact that $A_r$ and $A_t$ are orthogonal. So each pair of
positions is contained in at most one block.


To see that each pair of positions is contained in at least one block we
double count the set:


$S = \{((i, j), (k, l), B) \mid B \in \mathcal{B},\ (i, j) \neq (k, l),\ B \text{ contains } (i, j) \text{ and } (k, l)\}$.

Firstly,

$$|S| = \sum_{(i,j) \neq (k,l)} |\{B \mid B \text{ contains } (i, j) \text{ and } (k, l)\}| \leq \sum_{(i,j) \neq (k,l)} 1 = 2 \binom{n^2}{2} \quad \text{(uses “at most one”)}.$$

Secondly,

$$|S| = \sum_{B \in \mathcal{B}} |\{((i, j), (k, l)) \mid (i, j) \neq (k, l),\ B \text{ contains } (i, j) \text{ and } (k, l)\}| = n(n + 1) \cdot 2 \binom{n}{2} = 2 \binom{n^2}{2}.$$

So we realize that the “$\leq$” is actually an equality, so in particular each
pair of points is contained in exactly one block.
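The construction of the blocks from MOLS can be sketched as follows. The code uses the squares $L_n^r$ with entries $r \cdot i + j \bmod n$ for a prime $n$ as input and 0-based positions; the function names are assumptions for this illustration.

```python
from itertools import combinations

def cyclic_square(n, r):
    """L_n^r with 1-based indices: entry (i, j) is r*i + j mod n."""
    return [[(r * i + j) % n for j in range(1, n + 1)] for i in range(1, n + 1)]

def plane_from_mols(squares):
    """Points and blocks: rows R(i), columns C(j) and the classes A_r(s)."""
    n = len(squares[0])
    points = [(i, j) for i in range(n) for j in range(n)]
    blocks = [frozenset((i, j) for j in range(n)) for i in range(n)]   # R(i)
    blocks += [frozenset((i, j) for i in range(n)) for j in range(n)]  # C(j)
    for square in squares:
        for s in range(n):
            blocks.append(frozenset(
                (i, j) for (i, j) in points if square[i][j] == s))     # A_r(s)
    return points, blocks

def is_affine_plane(points, blocks, n):
    """True if all blocks have size n and each pair of points lies in exactly one."""
    if any(len(block) != n for block in blocks):
        return False
    return all(
        sum(1 for block in blocks if p in block and q in block) == 1
        for p, q in combinations(points, 2)
    )
```

For $n = 5$ this gives $5 \cdot 6 = 30$ blocks on 25 points, in line with $|\mathcal{B}| = n(n + 1)$.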


(iv) $\Rightarrow$ (i). Assume we have a $(v = n^2, k = n, \lambda = 1)$-design. As argued before
(Corollary 6.4), there are $|\mathcal{B}| = (n + 1) \cdot n$ blocks in total. Since affine
planes are resolvable by Theorem 6.15 we get $n + 1$ parallel classes $\mathcal{B} =
\mathcal{B}_1 \cup \ldots \cup \mathcal{B}_{n+1}$ consisting of $n$ blocks each. We use two of these parallel
classes $\mathcal{B}_1$ and $\mathcal{B}_2$ to find “coordinates” for the $n^2$ points and the remaining
$n - 1$ parallel classes to define MOLS.


We label the blocks from $\mathcal{B}_1$ and $\mathcal{B}_2$ as $\mathcal{B}_1 = \{R(1), R(2), \ldots, R(n)\}$ and
$\mathcal{B}_2 = \{C(1), C(2), \ldots, C(n)\}$. For $i, j \in [n]$, we know that every pair
of blocks $R(i)$ and $C(j)$ intersects in exactly one point $p_{ij}$: Because of
$t = 2$ and $\lambda = 1$, they cannot intersect in two points and because of
the claim from the proof of Theorem 6.15 they cannot be disjoint. This
means that the blocks from $\mathcal{B}_1$ and $\mathcal{B}_2$ intersect as shown in Figure 44, so
we can interpret the blocks $R(i)$ as rows and $C(j)$ as columns ($i, j \in [n]$).


[Figure: a grid of points whose columns are labelled C(1), C(2), C(3), C(4), C(5).]

Figure 44: The two different parallel classes $\mathcal{B}_1$ and $\mathcal{B}_2$ intersect like this when
arranging the points accordingly.


To simplify notation, identify the point $p_{ij}$ of the design with the
position $(i, j) \in [n] \times [n]$ in Latin squares. Then for each $l \in \{3, \ldots, n + 1\}$
we interpret the $n$ blocks $B_l(0), B_l(1), \ldots, B_l(n - 1)$ of the parallel class
$\mathcal{B}_l$ as a Latin square where the positions containing a number $r \in \mathbb{Z}_n$
are given by $B_l(r)$. This really is a valid Latin square since each $B_l(r)$
intersects each $R(i)$ and $C(j)$ in exactly one point, so each number occurs
exactly once in every row and column.

It is also easy to verify that the $n - 1$ Latin squares we get from $\mathcal{B}_3$ to
$\mathcal{B}_{n+1}$ are mutually orthogonal, since for any distinct $s, t \in \{3, \ldots, n + 1\}$
and $r_1, r_2 \in \mathbb{Z}_n$ we know that $|B_s(r_1) \cap B_t(r_2)| = 1$, which just means that
the pair $(r_1, r_2) \in \mathbb{Z}_n \times \mathbb{Z}_n$ occurs exactly once in the juxtaposition of the
Latin squares for $\mathcal{B}_s$ and $\mathcal{B}_t$.
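The reverse direction can be traced on a concrete plane as well. The sketch below assumes the coordinate plane over $\mathbb{Z}_p$ for a prime $p$ as input and computes the parallel classes greedily; these choices and all function names are assumptions for this illustration, not part of the proof.

```python
from itertools import combinations

def affine_plane(p):
    """Lines of the affine plane over Z_p, assuming p is prime."""
    lines = [frozenset((c, y) for y in range(p)) for c in range(p)]
    for m in range(p):
        for c in range(p):
            lines.append(frozenset((x, (m * x + c) % p) for x in range(p)))
    return lines

def parallel_classes(blocks):
    """Greedily group blocks into classes of pairwise disjoint blocks."""
    classes = []
    for block in blocks:
        for cls in classes:
            if all(block.isdisjoint(other) for other in cls):
                cls.append(block)
                break
        else:
            classes.append([block])
    return classes

def mols_from_plane(blocks):
    """Use two parallel classes as rows and columns, the rest as Latin squares."""
    classes = parallel_classes(blocks)
    rows, cols, rest = classes[0], classes[1], classes[2:]
    n = len(rows)
    squares = []
    for cls in rest:
        square = [[None] * n for _ in range(n)]
        for number, block in enumerate(cls):
            for i, row in enumerate(rows):
                for j, col in enumerate(cols):
                    if (row & col) <= block:  # the point p_ij lies in this block
                        square[i][j] = number
        squares.append(square)
    return squares

def is_latin_square(square):
    n = len(square)
    symbols = set(range(n))
    return (all(set(row) == symbols for row in square)
            and all({square[i][j] for i in range(n)} == symbols for j in range(n)))

def are_orthogonal(a, b):
    n = len(a)
    return len({(a[i][j], b[i][j]) for i in range(n) for j in range(n)}) == n * n
```

Applied to `affine_plane(5)`, this yields four Latin squares of order 5 that are pairwise orthogonal.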